Nearly all protein structures known to science predicted by AlphaFold AI

Let's hold off the champagne until an actual drug is developed using this tech

The AI-powered protein-folding model AlphaFold has predicted the structures of more than 200 million proteins, nearly all of those known to science, DeepMind said on Thursday.

Proteins are complex biological molecules produced in living organisms from instructions stored in DNA. Made from as many as 20 types of amino acids, these nano-scale chains perform vital cellular tasks to carry out all sorts of bodily functions. Knowing a protein's three-dimensional form is important because its physical structure provides hints about how it behaves and what purpose it serves, which helps us do things like develop drugs and create copycat proteins for those lacking them.

Some proteins are helpful, such as those involved in digesting food, while others can be harmful, such as those involved in the growth of tumors. Figuring out their complicated wriggly shapes, however, is difficult. Molecular biologists can spend years conducting experiments to decipher a protein's structure, whereas AlphaFold can predict it from the amino acid sequence in minutes, depending on how large the molecule is.

AlphaFold was trained on hundreds of thousands of known protein structures, and learned the relationships between the constituent amino acids and the final overall shapes. Given an arbitrary input amino acid sequence, the model can predict a 3D protein structure. Now, the model has predicted nearly all protein structures known to science.
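To make "an input amino acid sequence" concrete: protein sequences are conventionally written as strings of one-letter codes, one letter per amino acid. The short Python sketch below is not AlphaFold's code and does not predict any structure. It only shows what such an input looks like and checks that it uses the 20 standard amino acid letters. The function names and the example peptide are hypothetical, chosen purely for illustration.

```python
from collections import Counter

# One-letter codes for the 20 standard amino acids.
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def validate_sequence(sequence: str) -> str:
    """Return the sequence upper-cased, or raise ValueError on unknown letters.

    Hypothetical helper for illustration; not part of any AlphaFold API.
    """
    cleaned = sequence.strip().upper()
    if not cleaned:
        raise ValueError("empty sequence")
    unknown = sorted(set(cleaned) - STANDARD_AMINO_ACIDS)
    if unknown:
        raise ValueError(f"non-standard residue codes: {unknown}")
    return cleaned


def composition(sequence: str) -> Counter:
    """Count how often each amino acid occurs in a validated sequence."""
    return Counter(validate_sequence(sequence))


if __name__ == "__main__":
    # Made-up short peptide, used only as an example input.
    example = "MKTAYIAKQR"
    print(len(validate_sequence(example)), "residues")
    print(composition(example).most_common(3))
```

A structure predictor takes a string like this as its starting point; everything about the 3D shape has to be inferred from the order of those letters.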


Working together with the European Bioinformatics Institute, DeepMind has expanded its AlphaFold Protein Structure Database to contain over 200 million 3D shapes of proteins from animals to plants, bacteria to viruses – an increase of more than 200x from nearly a million molecules to at least 200 million molecules in just a year.

"We hoped this groundbreaking resource would help accelerate scientific research and discovery globally, and that other teams could learn from and build on the advances we made with AlphaFold to create further breakthroughs," Demis Hassabis, DeepMind's co-founder and CEO, said in a statement Thursday.

"That hope has become a reality far quicker than we had dared to dream. Just twelve months later, AlphaFold has been accessed by more than half a million researchers and used to accelerate progress on important real-world problems ranging from plastic pollution to antibiotic resistance."

The Register has asked DeepMind for further comment. 

AlphaFold has also shown great potential for designing new drugs. The structures help scientists identify chemical compounds that can bind to target proteins, either to treat a disease or to prevent those proteins from carrying out pathological functions. Companies including Insilico Medicine have experimented with the model to discover new drugs; CEO Alex Zhavoronkov told The Register that the process is much more complicated than you might think, and involves several steps.

It's not clear how accurate AlphaFold's predictions are. A protein's ribbon-like structure often changes shape when it interacts with a drug, something AlphaFold cannot help scientists with because it was not trained on such interactions. Zhavoronkov said the model is a "pretty remarkable breakthrough" but was wary of all the hype.

"Until we see a structure for a novel target in a big disease obtained via AlphaFold without any additional experiments, a molecule designed using AI – or other methods – using this predicted structure, synthesized and tested all the way and then published in a high journal – [we can] then celebrate."

Big pharma want to see molecules designed with the help of AI tools like AlphaFold actually tested in mice and humans. "Pure algorithmic achievements are not valuable to the pharma companies and especially to the patients," Zhavoronkov added.

Fabio Urbina, a senior scientist at Collaboration Pharmaceuticals, a startup using machine-learning algorithms to develop drugs for rare genetic diseases, said AlphaFold hasn't quite yet proved to be useful in his research. Urbina uses a different technique and focuses more on the structure of a potential new drug rather than a target protein.

"This is for a few reasons; the protein structures for a lot of drug targets were often not easily available for researchers to use, and protein information did not seem to help the early machine learning models improve their predictive power by a significant margin," he told The Register.

"I'm cautiously optimistic that AlphaFold has essentially 'solved' the first problem, but it has yet to be seen if the protein structures will be useful enough for our downstream application of improving machine-learning predictive power to help us discover new potential drugs for rare diseases. However, we've increasingly seen protein structural information taken into account as part of newer machine-learning methods, and we've thought about doing the same."

Making a database with nearly all known protein structures available, as DeepMind has promised, means more scientists will have the resources to experiment and build more powerful AI models, Urbina said. "I'm cautiously optimistic, but with the whole library of protein structures available, I would say there is a good chance that AlphaFold structures will be incorporated into some of our machine-learning models, and may ultimately help us to discover novel therapeutics." ®
