Google's claims of super-human AI chip layout back under the microscope
Nature probes published research as it emerges journal paper allegedly used to entice $120m cloud deal
Special report A Google-led research paper published in Nature, claiming machine-learning software can design better chips faster than humans, has been called into question after a new study disputed its results.
In June 2021, Google made headlines for developing a reinforcement-learning-based system capable of automatically generating optimized microchip floorplans. These plans determine the arrangement of blocks of electronic circuitry within the chip: where things such as the CPU and GPU cores, and memory and peripheral controllers, actually sit on the physical silicon die.
Google said it was using this AI software to design its homegrown TPU chips that accelerate AI workloads: it was employing machine learning to make its other machine-learning systems run faster.
The floorplan of a chip is important because it dictates how well the processor performs. You will want to arrange blocks of the chip's circuits carefully so that, for example, signals and data propagate between these areas at a desirable rate. Engineers typically spend weeks or months refining their designs trying to find the optimal configuration. All of the different subsystems have to be placed in a particular way to produce a chip as powerful, energy efficient, and small as possible.
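To make the idea of a "good" arrangement concrete, here is a minimal sketch of half-perimeter wirelength, a rough proxy commonly used in placement research for how far signals have to travel between connected blocks. It is not Google's or UCSD's code. The block names, coordinates, and connections below are invented for illustration, and real tools also weigh congestion, timing, power, and area.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def hpwl(positions: Dict[str, Point], nets: List[List[str]]) -> float:
    """Half-perimeter wirelength: for each net (a group of connected
    blocks), add the width plus height of the bounding box around them."""
    total = 0.0
    for net in nets:
        xs = [positions[block][0] for block in net]
        ys = [positions[block][1] for block in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


# Hypothetical connections between blocks (assumed for this example only).
nets = [["cpu", "cache"], ["cpu", "mem_ctrl"], ["gpu", "mem_ctrl"]]

# Two hypothetical floorplans of the same four blocks on a grid.
scattered = {"cpu": (0, 0), "cache": (9, 9), "gpu": (0, 9), "mem_ctrl": (9, 0)}
clustered = {"cpu": (0, 0), "cache": (1, 0), "gpu": (2, 1), "mem_ctrl": (1, 1)}

scattered_cost = hpwl(scattered, nets)
clustered_cost = hpwl(clustered, nets)
print(f"scattered: {scattered_cost}, clustered: {clustered_cost}")
```

In this toy case, the layout that keeps connected blocks close together scores a much lower wirelength, which is the kind of trade-off a floorplanner, human or machine, is trying to balance against other constraints.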
Producing a floorplan today usually involves a mix of manual work and automation using chip design applications. Google's team sought to demonstrate that its reinforcement-learning approach would produce designs better than those made just by human engineers using industry tools. Not only that, Google said its model completed its work much faster than engineers iterating over layouts.
"Despite five decades of research, chip floorplanning has defied automation, requiring months of intense effort by physical design engineers to produce manufacturable layout … In under six hours, our method automatically generates chip floorplans that are superior or comparable to those produced by humans in all key metrics," the Googlers wrote in their Nature paper.
The research got the attention of the electronic design automation community, which was already moving toward incorporating machine-learning algorithms into its software suites. Now Google's claims about its better-than-humans model have been challenged by a team at the University of California, San Diego (UCSD).
Led by Andrew Kahng, a professor of computer science and engineering, that group spent months reverse engineering the floorplanning pipeline Google described in Nature. The web giant withheld some details of its model's inner workings, citing commercial sensitivity, so the UCSD team had to figure out how to make its own complete version to verify the Googlers' findings. Prof Kahng, we note, served as a reviewer for Nature during the peer-review process of Google's paper.
The university academics ultimately found their own recreation of the original Google code, referred to as circuit training (CT) in their study, actually performed worse than humans using traditional industry methods and tools.
What could have caused this discrepancy? One might say the recreation was incomplete, though there may be another explanation. Over time, the UCSD team learned Google had used commercial software developed by Synopsys, a major maker of electronic design automation (EDA) suites, to create a starting arrangement of the chip's logic gates that the web giant's reinforcement learning system then optimized.
The Google paper did mention that industry-standard software tools and manual tweaking were used after the model had generated a layout, primarily to ensure the processor would work as intended and finalize it for fabrication. The Googlers argued this was a necessary step whether the floorplan was created by a machine-learning algorithm or by humans with standard tools, and thus its model deserved credit for the optimized end product.
However, the UCSD team said there was no mention in the Nature paper of EDA tools being used beforehand to prepare a layout for the model to improve. It's argued these Synopsys tools may have given the model a decent enough head start that the AI system's true capabilities should be called into question.
"This was not apparent during the paper review," the university team wrote of the use of Synopsys' suite to prep a layout for the model, "and is not mentioned in Nature. Experiments show that having initial placement information can significantly enhance CT outcomes."
Nature investigates Google's research
Some academics have since urged Nature to review Google's paper in light of UCSD's study. In emails to the journal viewed by The Register, researchers highlighted concerns raised by Prof Kahng and his colleagues, and questioned whether Google's paper was misleading.
Bill Swartz, a senior lecturer teaching electrical engineering at the University of Texas at Dallas, said the Nature paper "left a lot of [researchers] in the dark" since the results involved the internet titan's proprietary TPUs and were, therefore, impossible to verify.
The use of Synopsys' software to prime Google's software needs to be investigated, he said. "We all just want to know the actual algorithm so we can reproduce it. If [Google's] claims are right, then we want to implement it. There should be science, it should all be objective; if it works, it works," he said.
Nature told The Register it is looking into Google's paper, though it did not say exactly what it was investigating nor why.
"We cannot comment on the details of individual cases for confidentiality reasons," a spokesperson for Nature told us. "However, speaking generally, when concerns are raised about any paper published in the journal, we look into them carefully following an established process.
"This process involves consultation with the authors and, where appropriate, seeking advice from peer reviewers and other external experts. Once we have enough information to make a decision we follow up with the response that is most appropriate and that provides clarity for our readers as to the outcome."
This isn't the first time the journal has performed a post-publication probe into the study, which is facing renewed scrutiny. The Googlers' paper has remained online with an author correction added in March 2022, which included a link to some of Google's open source CT code for those trying to follow the study's methods.
No-pretraining and not enough compute?
The lead authors of Google's paper, Azalia Mirhoseini and Anna Goldie, said the UCSD team's work isn't an accurate implementation of their method. They pointed out that Prof Kahng's group obtained worse results since they didn't pre-train their model on any data at all.
"A learning-based method will of course perform worse if it is not allowed to learn from prior experience. In our Nature paper, we pre-train on 20 blocks before evaluating on held-out test cases," the two said in a statement [PDF].
Mirhoseini and Goldie also said Prof Kahng's team did not train its system using the same amount of computing power as Google used, and suggested this step may not have been carried out properly, crippling the model's performance. The pair added that the pre-processing step using EDA applications, which was not explicitly described in their Nature paper, wasn't important enough to mention.
"The [UCSD] paper focuses on the use of the initial placement from physical synthesis to cluster standard cells, but this is of no practical concern. Physical synthesis must be performed before running any placement method," they said. "This is standard practice in chip design."
The UCSD group, however, said they didn't pre-train their model because they didn't have access to Google's proprietary data. They claimed their software had nonetheless been verified by two other engineers at the internet giant, who were also listed as co-authors of the Nature paper. Prof Kahng is presenting his team's study at this year's International Symposium on Physical Design conference on Tuesday.
Meanwhile, Google continues to use reinforcement-learning-based techniques to enhance its TPUs, which are actively used in its datacenters.
Fired Googler claims research was hyped for a lucrative cloud deal
Separately, the Nature paper's claims of superhuman performance were disputed within the internet goliath. In May last year, Satrajit Chatterjee, an AI researcher, was fired from Google with cause; he claimed he was let go because he had criticized the Nature study and contested the paper's findings. Chatterjee was also told Google wouldn't publish his paper critiquing the first study.
He was also accused by other Googlers of going too far in his criticism – such as, for instance, allegedly verbally describing the work as a "train wreck" and a "tire fire" – and was placed under HR investigation for his alleged behavior.
Chatterjee has since sued Google in the Superior Court of California in Santa Clara claiming wrongful termination. Chatterjee declined to comment for this story, and he denies any wrongdoing. Mirhoseini and Goldie left Google in mid-2022 after Chatterjee was axed.
In Chatterjee's complaint against Google, which was amended [PDF] last month, his lawyers claimed the web giant was thinking about commercializing its AI-based floorplan-generating software with "Company S" while it was negotiating a Google Cloud deal reportedly worth $120 million with S at the time. Chatterjee claimed Google championed the floorplan paper to help convince Company S to get onboard with this significant commercial pact.
"The study was done in part as a first step toward potential commercialization with [Company S] (and conducted with resources from [Company S]). Since it was done in the context of a large potential Cloud deal, it would have been unethical to imply that we had revolutionary technology when our tests showed otherwise," Chatterjee wrote in an email to Google's CEO Sundar Pichai, Vice President and Engineering Fellow Jay Yagnik, and VP of Google Research Rahul Sukthankar, which was disclosed as part of the lawsuit.
His court filings accused Google of "overstating" its study's results, and "deliberately withholding material information from Company S to induce it to sign a cloud computing deal," effectively wooing the other business using what he saw as questionable technology.
Company S is described as an "electronic design automation company" in the court documents. People familiar with the matter told The Register Company S is Synopsys.
Synopsys and Google declined to comment. ®