DeepMind’s AlphaStar AI bot has reached Grandmaster level at StarCraft II, a popular battle strategy computer game, after ranking within the top 0.15 per cent of players in an online league.
StarCraft II is a complex game with a massive following and its own annual professional tournament, the StarCraft II World Championship Series, in which the best international teams compete for a prize pot of more than $2m.
AlphaStar, however, isn’t quite good enough to compete in that competition. Instead it set its eyes on a much smaller contest on Battle.net, the game’s official online league hosted by China-friendly gaming biz Blizzard Entertainment.
Researchers at Google stablemate DeepMind entered their bot AlphaStar into a series of blind games, in which its opponents had no idea they were playing against a computer. Three neural networks were trained to play a series of 1v1 matches, one for each species in the game. There were also three variants of AlphaStar: AlphaStar Supervised, AlphaStar Mid, and AlphaStar Final.
“The supervised agent was rated in the top 16 per cent of human players, the midpoint agent within the top 0.5 per cent, and the final agent, on average, within the top 0.15 per cent, achieving a Grandmaster level rating for all three races,” according to the results published in a paper in Nature this week.
AlphaStar may have achieved Grandmaster status, but at what cost?
AlphaStar Final performed the best out of them all, and was ranked above 99.8 per cent of amateur human players in the Battle.net league. Although the online competition contains about 90,000 players in the European region alone, AlphaStar did not play against every single person.
Instead, AlphaStar Supervised played a total of 90 games and AlphaStar Mid played 180 games. The rating for AlphaStar Final, however, was not calculated from scratch: it picked up from where AlphaStar Mid left off, and was worked out after the bot had played an additional 90 games on top.
StarCraft is impossibly hard for a computer to master using machine learning techniques alone. There are up to 10^26 possible actions that a bot can take at each step of the game. For that reason, it's fed some prior knowledge learnt from observing human gameplay during the training process.
AlphaStar also has another advantage over humans. Since practice makes (nearly) perfect, the bot played millions of games to rack up an experience longer than a human lifetime to become good at the game.
It learned to play the game by imitating human strategies and playing against multiple versions of itself using a technique known as self-play. For that reason, the bot struggles to come up with novel strategies of its own and whilst it’s a solid player, it’s not very robust against tactics it hasn’t come up against before.
Teaching a computer to play StarCraft is incredibly computationally intensive and requires a ridiculous amount of resources. DeepMind needed 384 Google TPU v3 math accelerators, and since each unit contains eight cores, that’s a whopping 3,072 cores in total over 44 days of training time. Under the web giant's current cloud pricing, it costs $8 to rent out a single TPU v3 per hour.
So, in theory, running 384 TPU v3 chips over the course of 44 days straight would, for you and me, rack up a computing bill of $3,244,032 – a price that only very few AI research labs can afford. No doubt DeepMind got a steep discount.
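For anyone who wants to check the sums, here is a minimal back-of-the-envelope sketch in Python. It assumes the $8 hourly rate applies per TPU v3 unit, that all 384 units run around the clock for the full 44 days, and that no discounts apply. The constant and function names are ours, purely for illustration.

```python
# Back-of-the-envelope estimate of the list-price training bill.
# Assumptions: $8 per TPU v3 unit per hour, every unit busy 24 hours a day
# for the whole run, no discounts. All names here are illustrative.

TPU_UNITS = 384            # TPU v3 units used for training
CORES_PER_UNIT = 8         # cores in each TPU v3 unit
TRAINING_DAYS = 44         # length of the training run
HOURS_PER_DAY = 24
PRICE_PER_UNIT_HOUR = 8    # US dollars, list price per unit per hour


def total_cores(units: int, cores_per_unit: int) -> int:
    """Number of TPU cores across all units."""
    return units * cores_per_unit


def training_bill(units: int, days: int, price_per_unit_hour: int) -> int:
    """List-price cost in dollars of renting `units` devices for `days` days."""
    hours = days * HOURS_PER_DAY
    return units * hours * price_per_unit_hour


if __name__ == "__main__":
    cores = total_cores(TPU_UNITS, CORES_PER_UNIT)
    bill = training_bill(TPU_UNITS, TRAINING_DAYS, PRICE_PER_UNIT_HOUR)
    print(f"Total cores: {cores:,}")   # Total cores: 3,072
    print(f"Estimated bill: ${bill:,}")  # Estimated bill: $3,244,032
```

Change the price or the number of days and the total scales linearly, which is why even a modest discount would knock a sizeable chunk off the bill.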
DeepMind reckons the whole effort is worth it, however, as teaching machine-learning models to master a difficult game like StarCraft could help computers in real-world scenarios, where they have to make use of “limited information to make dynamic and difficult decisions that have ramifications on multiple levels and timescales.”
“The techniques we used to develop AlphaStar will help further the safety and robustness of AI systems in general, and, we hope, may serve to advance our research in real-world domains,” it said in a statement.
We've yet to see any research or evidence that the strategies learned from a domain like StarCraft can be applied in the real world, though.
You can watch AlphaStar in action here. ®