Checkmate: DeepMind's AlphaZero AI clobbered rival chess app on non-level playing, er, board

Good effort but the games were seemingly rigged


Analysis DeepMind claimed this month its latest AI system – AlphaZero – mastered chess and Shogi as well as Go to "superhuman levels" within a handful of hours.

Sounds impressive, and to an extent it is. However, some things are too good to be completely true. Now experts are questioning AlphaZero's level of success.

AlphaZero is based on AlphaGo, the machine-learning software that beat 18-time Go champion Lee Sedol last year, and AlphaGo Zero, an upgraded version of AlphaGo that beat AlphaGo 100-0.

Like AlphaGo Zero, AlphaZero learned to play games by playing against itself, a technique in reinforcement learning known as self-play.

“Starting from random play, and given no domain knowledge except the game rules, AlphaZero achieved within 24 hours a superhuman level of play in the games of chess and shogi (Japanese chess) as well as Go, and convincingly defeated a world-champion program in each case,” DeepMind's research team wrote in a paper detailing AlphaZero's design.

AlphaZero faced Stockfish, a chess-playing AI program that won the Top Chess Engine Championship (TCEC) last year. AlphaZero won 28 games of chess, drew 72, and lost none against Stockfish.

Shogi, a Japanese strategy game similar to chess, is more complex. Here, AlphaZero beat Elmo, a Shogi engine, in 90 games, drew two, and lost eight.

The rules of the two board games were provided to AlphaZero, and the system learned how to master them both over the course of 68 million self-play matches. To put it another way, AlphaZero took four hours to grasp chess to a level where it could beat Stockfish, spending nine hours in total on the game – and took less than two hours to master Shogi to the point where it could see off Elmo. AlphaZero also creamed DeepMind's Go-playing AI AlphaGo Lee after eight hours of training.

It’s an impressive feat – but one that was achieved by carefully manipulating the experiment, Jose Camacho Collados, an AI researcher and an international chess master, argued in an analysis this week.

Firstly, DeepMind is part of Google-parent Alphabet, and thus has access to massive computing power. AlphaZero was trained on 64 TPU2s – the second generation of Google’s TPU accelerator chip – while a whopping 5,000 first-generation TPUs generated the self-play games from which it learned.

That means, as Camacho Collados pointed out, the four hours of chess training amount to roughly two years of compute on a single TPU. In contrast, Stockfish and Elmo were given only 64 x86 CPU threads and a hash size of 1GB, meaning the game engines were not on an equal footing with AlphaZero to begin with.
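The "roughly two years" figure can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes the figure refers to the four hours of chess training multiplied across the 5,000 first-generation TPUs, and that the load was spread evenly; neither assumption is spelled out in DeepMind's paper.

```python
# Back-of-the-envelope check of the "two years per TPU" figure.
# Assumption: four hours of chess training, spread evenly across the
# 5,000 first-generation TPUs that generated self-play games.
TPUS = 5_000
HOURS_OF_TRAINING = 4
HOURS_PER_YEAR = 24 * 365

tpu_hours = TPUS * HOURS_OF_TRAINING
tpu_years = tpu_hours / HOURS_PER_YEAR

print(f"{tpu_hours} TPU-hours is about {tpu_years:.1f} TPU-years")
```

Under those assumptions the total is 20,000 TPU-hours, a little over two years of continuous running on one chip.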

AlphaZero ran on math-crunching hardware dedicated to neural networks, while its opponents ran on PCs. Think supercar versus a Ford Focus.

“The experimental setting does not seem fair,” Camacho Collados said. “The version of Stockfish used was not the last one but, more importantly, it was run in its released version run on a normal PC, while AlphaZero was ran using considerable higher processing power. For example, in the TCEC competition engines play against each other using the same processor.”

Next, DeepMind's paper stated that both systems, AlphaZero and Stockfish, were given one minute to make a move. That is highly unorthodox for tournament play. As everyone knows, in a chess match, players are typically given a bank of time in which to make all their moves, not a countdown per move. For example, the World Chess Federation gives players "90 minutes for the first 40 moves followed by 30 minutes for the rest of the game with an addition of 30 seconds per move starting from move one."

That means some actions, such as early moves, can be performed quickly, giving a player more time – more than a minute if needed – to perform later-stage maneuvers. Stockfish was designed to manage a bank of time over a whole game rather than play against a minute-long shot clock.
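To make the difference concrete, here is a minimal sketch comparing a fixed one-minute shot clock with a bank of the same total size shared across a 40-move game. The `criticality` weights and both function names are hypothetical and purely illustrative; this is not how Stockfish's time management actually works, and increments are ignored for simplicity.

```python
def fixed_per_move(num_moves, seconds_per_move=60):
    """Every move gets the same allowance, whether or not it matters."""
    return [seconds_per_move] * num_moves


def bank_allocation(criticality, bank_seconds):
    """Share a single time bank across moves in proportion to
    hypothetical 'criticality' weights (higher means a harder position)."""
    total_weight = sum(criticality)
    return [bank_seconds * w / total_weight for w in criticality]


# Hypothetical 40-move game: a quiet opening, five critical
# middlegame moves, then a quiet run to the finish.
criticality = [1] * 10 + [4] * 5 + [1] * 25

shot_clock = fixed_per_move(len(criticality))
banked = bank_allocation(criticality, bank_seconds=sum(shot_clock))

print(f"shot clock: {max(shot_clock)}s on every move")
print(f"bank: {min(banked):.0f}s on quiet moves, "
      f"{max(banked):.0f}s on critical ones")
```

Both schemes spend the same 40 minutes overall, but only the bank lets an engine think for well over a minute where it counts.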

AlphaZero, on the other hand, was optimized for minute-to-minute play. The neural network took the positions on the board as input, and spat out a range of moves and chose the one with the highest chance of winning at every move. It learned this by self-play and using a Monte Carlo tree search algorithm to sort through the potential strategies.
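As a rough illustration of how a neural network's move probabilities can steer a tree search, the sketch below scores each candidate move by its average result so far plus an exploration bonus weighted by the network's prior. It is loosely modelled on the selection rule DeepMind has described for its Zero-family programs; the function name, the constant `c_puct=1.5`, and all the numbers are assumptions for the example, not AlphaZero's code, which has not been released.

```python
import math


def puct_select(priors, visit_counts, total_values, c_puct=1.5):
    """Pick the move with the best mix of observed value and prior-guided
    exploration. All inputs are dicts keyed by move."""
    total_visits = sum(visit_counts.values())

    def score(move):
        n = visit_counts[move]
        q = total_values[move] / n if n else 0.0  # average result so far
        u = c_puct * priors[move] * math.sqrt(total_visits) / (1 + n)
        return q + u

    return max(priors, key=score)


# Made-up statistics for three candidate moves at one position.
priors = {"e4": 0.5, "d4": 0.3, "Nf3": 0.2}        # network's move probabilities
visit_counts = {"e4": 10, "d4": 2, "Nf3": 0}       # simulations per move so far
total_values = {"e4": 6.0, "d4": 1.5, "Nf3": 0.0}  # summed simulation results

next_move = puct_select(priors, visit_counts, total_values)
print(next_move)
```

With these numbers the search explores `d4` next: it has a decent average result but few visits, so its exploration bonus outweighs the heavily visited `e4`.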

Camacho Collados noted:

The selection of the time seems odd. Each engine was given one minute per move. However, in the vast majority of human and engine competitions each player is given a fixed amount of time for the whole game, and then this time is administered individually. As Tord Romstad, one of the original developers of Stockfish, declared, this was another questionable decision in detriment of Stockfish, as “lot of effort has been put into making Stockfish identify critical points in the game and decide when to spend some extra time on a move.”

The decision to go with one-minute timeouts, as well as under-powering its competitors, seems awfully convenient for DeepMind.

It’s also difficult to really scrutinize AlphaZero since DeepMind has not released the code publicly for any of its game-playing systems. It’s impossible to test any of the claims made, or to check whether the results are reproducible.

In the paper, ten games played between AlphaZero and Stockfish were cherry-picked by the researchers to show AlphaZero winning. The losses it faced against Elmo in Shogi have not been published, so it’s impossible to see where the software was inferior.

“It is customary in scientific papers to show examples on which the proposed system displays some weaknesses or may not behave as well in order to have a more global understanding and for other researchers to build upon it,” Camacho Collados wrote.

“We should scientifically scrutinize alleged breakthroughs carefully, especially in the period of AI hype we live now. It is actually responsibility of researchers in this area to accurately describe and advertise our achievements, and try not to contribute to the growing (often self-interested) misinformation and mystification of the field.

“I personally have a lot of hope in the potential of DeepMind in achieving relevant discoveries in AI, but I hope these achievements will be developed in a way that can be easily judged by peers and contribute to society."

Other machine-learning experts El Reg chatted to this week privately agreed that while AlphaZero is a cool research project, it is not quite the scientific breakthrough the mainstream press has been screaming about.

A spokesperson from DeepMind told The Register that it could not comment on any of the claims made since “the work is being submitted for peer review and unfortunately we cannot say any more at this time.” ®

Biting the hand that feeds IT © 1998–2022