Google is Lightyears Ahead in AI

Google’s DeepMind is incredible. In 2016, AlphaGo (one of DeepMind’s projects) bested world champion Go player Lee Sedol at the 2,500-year-old Chinese game that many AI experts thought would not be cracked for at least another decade.

AlphaGo won that match against Sedol 4–1, shattering the belief of experts and Go fanatics around the world that a computer could not yet beat a top human. There’s a great Netflix documentary on the feat that chronicles the AlphaGo team’s quest to defeat the grandmasters of the ancient game. It reminded me of watching IBM’s Deep Blue defeat grandmaster Garry Kasparov at chess when I was younger, except this was much scarier.

Deep Blue was able to defeat the best chess players in the world because chess is a game with a relatively small number of possible moves each turn, so a computer can essentially ‘brute force’ its way through the potential moves a player could make and counter them. Go is a very different game. With an average of around 200 potential moves every turn, the number of possible configurations of a Go board quickly exceeds the number of atoms in the entire universe, which is approximately 10^78. A 19×19 Go board has roughly 2.08 × 10^170 legal positions, and at any point a player has at most 361 legal moves. A conservative lower bound puts the number of possible games at 10^(10^48), which is much, much larger than the number of atoms in the observable universe. This means that brute-forcing the outcome of a Go match is essentially impossible, even with the most powerful supercomputers in the world at your disposal.
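To get a feel for that scale, here’s a quick back-of-the-envelope calculation in Python. The branching factor and game length below are the rough averages usually quoted for professional play, not exact figures:

```python
# Rough game-tree size of Go versus atoms in the observable universe.
# These constants are ballpark figures, not exact values.
ATOMS_IN_UNIVERSE = 10**78   # commonly cited order of magnitude
BRANCHING_FACTOR = 200       # average legal moves per turn in Go
AVERAGE_GAME_LENGTH = 150    # average moves in a professional game

# Python integers have arbitrary precision, so the exact power is fine.
game_tree_size = BRANCHING_FACTOR ** AVERAGE_GAME_LENGTH

print(f"~10^{len(str(game_tree_size)) - 1} possible move sequences")
print(game_tree_size > ATOMS_IN_UNIVERSE)  # True, by ~267 orders of magnitude
```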

In order to surmount this problem, AlphaGo combines three cutting-edge techniques to predict outcomes and play the game better than any human. AlphaGo pairs two neural networks – a policy network that suggests promising moves (trained first on human expert games and then refined through deep reinforcement learning) and a value network that predicts the probability of a favourable outcome – with a Monte Carlo tree search that works somewhat like Deep Blue’s search did. The deep reinforcement learning piece works by having the system play itself over and over again, improving the policy network and optimizing the system to win the game by any margin necessary. The value network was then trained on 30 million game positions that the system experienced while playing itself to predict the probability of a positive outcome. Finally, the tree search procedure uses these networks as its evaluation function to build a ‘tree’ of possible game moves and select the best one based on the likely outcomes.
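To make the combination of networks and search more concrete, here is a heavily simplified sketch of the idea in Python. The policy_net and value_net functions are stubs standing in for deep networks, the game logic is a placeholder, and the selection rule is a simplified variant of the PUCT formula described in the AlphaGo paper – a sketch of the technique, not DeepMind’s implementation:

```python
import math
import random

# Placeholder "networks": in the real system these are deep
# convolutional networks trained on games; here they are stubs.
def policy_net(state):
    """Return a prior probability for each legal move (stub)."""
    moves = legal_moves(state)
    return {m: 1.0 / len(moves) for m in moves}

def value_net(state):
    """Return an estimated probability of winning from this state (stub)."""
    return random.uniform(0.0, 1.0)

def legal_moves(state):
    """Stand-in for a real Go rules engine."""
    return list(range(5))

def play(state, move):
    """Stand-in for applying a move and returning the new state."""
    return state + (move,)

class Node:
    def __init__(self, state, prior):
        self.state = state
        self.prior = prior      # P(s, a): prior from the policy network
        self.visits = 0         # N(s, a): visit count
        self.total_value = 0.0  # W(s, a): accumulated value
        self.children = {}      # move -> Node

    def q(self):
        return self.total_value / self.visits if self.visits else 0.0

def select_child(node, c_puct=1.0):
    """Pick the child maximizing Q + U (a simplified PUCT rule)."""
    total = sum(child.visits for child in node.children.values())
    def score(child):
        u = c_puct * child.prior * math.sqrt(total + 1) / (1 + child.visits)
        return child.q() + u
    return max(node.children.values(), key=score)

def search(root, n_simulations=200):
    for _ in range(n_simulations):
        node, path = root, [root]
        # 1. Selection: walk down the tree balancing exploration
        #    (policy priors) and exploitation (observed values).
        while node.children:
            node = select_child(node)
            path.append(node)
        # 2. Expansion: add children with priors from the policy net.
        for move, prior in policy_net(node.state).items():
            node.children[move] = Node(play(node.state, move), prior)
        # 3. Evaluation: the value network scores the leaf position
        #    (for brevity this skips flipping the value between players).
        value = value_net(node.state)
        # 4. Backup: propagate the evaluation up the visited path.
        for visited in path:
            visited.visits += 1
            visited.total_value += value
    # Play the most-visited move at the root, as AlphaGo does.
    return max(root.children, key=lambda m: root.children[m].visits)

root = Node(state=(), prior=1.0)
print("chosen move:", search(root))
```

The key design choice is in step 3: rather than scoring a leaf by playing a random game to the end, as classic Monte Carlo tree search does, the value network evaluates the position directly. (The original AlphaGo actually blended the value network’s estimate with fast rollouts; later versions dropped rollouts entirely.)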

The architecture of the system is impressively advanced, and its ability to beat humans is remarkable. However, the most amazing moment in the Netflix documentary comes when the world champion, Sedol, realizes that AlphaGo is making moves so incredible, so creative, that he has never seen a human player even think of playing them.

Since beating the 9-dan professional two years ago, Google has iterated its AlphaGo platform through two new generations. The latest generation, AlphaGo Zero, beat the iteration of AlphaGo that defeated humanity’s best Go player by a margin of 100–0. That’s right, the newest version of AlphaGo destroyed the version of AlphaGo that beat the best human player, 100 games to none. The craziest thing is, the new version of AlphaGo was not given any instruction beyond the basic rules of the game. It simply played itself over and over again, millions of times, until it knew how to play better than any human or machine ever created.
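To see the shape of that self-play loop, here is a minimal toy sketch of the same recipe in Python. The game is a tiny Nim variant rather than Go, the ‘network’ is just a lookup table, and every name is illustrative rather than taken from DeepMind’s code – but the loop is the same: start from only the rules, play yourself, and learn position values from the outcomes:

```python
import random

# A toy version of the AlphaGo Zero recipe: start with only the rules,
# play yourself many times, and learn position values from the outcomes.
# The game is a tiny Nim variant (take 1 or 2 stones; taking the last
# stone wins) and the "network" is a lookup table.

values = {}  # stones remaining -> learned value for the player to move

def choose_move(stones, explore=0.2):
    moves = [m for m in (1, 2) if m <= stones]
    if random.random() < explore:
        return random.choice(moves)  # occasional exploration
    # Otherwise leave the opponent in the worst-valued position.
    return min(moves, key=lambda m: values.get(stones - m, 0.0))

def self_play_game(stones=10):
    history, player = [], 0
    while stones > 0:
        history.append((player, stones))
        stones -= choose_move(stones)
        player ^= 1
    return history, player ^ 1  # whoever took the last stone won

def train(n_games=50_000, lr=0.05):
    for _ in range(n_games):
        history, win = self_play_game()
        for player, stones in history:
            z = 1.0 if player == win else -1.0  # outcome, mover's view
            v = values.get(stones, 0.0)
            values[stones] = v + lr * (z - v)  # nudge value toward outcome

train()
# After training, positions with stones % 3 == 0 should drift toward
# negative values: the agent discovers the losing positions without
# ever being told the strategy.
print({s: round(v, 2) for s, v in sorted(values.items())})
```

Scaled up, with a deep network in place of the lookup table and Monte Carlo tree search guiding every move, this same loop is what took AlphaGo Zero from random play to superhuman strength in a matter of days.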

[Image courtesy of Google’s DeepMind blog]

This awesome video by Two Minute Papers talks about how Google’s DeepMind has iterated over the past few years and how AlphaGo is now far stronger than the best human players, while training and running on a fraction of the hardware that earlier versions required.

[Image courtesy of Google’s DeepMind blog]

It is scary and incredible how fast DeepMind is advancing in multiple fields. Right now it is still considered a narrow AI (i.e., it doesn’t perform or learn well outside its primary domain); however, the same algorithms are being applied to best the greatest humans in everything from medicine to video gaming. In my opinion, it is only a matter of a few years before systems like this can beat humans at nearly everything we think we do well today. We can only hope that Google has built in the appropriate safeguards to protect us from its own software.

If you want to learn more about how our future robot overlords think, there’s no better way to get started than by racing an autonomous robocar around a track. Come check out @DIYRobocarsCanada on Instagram and join our Meetup group to get involved today!