In 2020, an artificial intelligence lab called DeepMind unveiled technology that could predict the shape of proteins — the microscopic mechanisms that drive the behavior of the human body and all other living things.
A year later, the lab shared the tool, called AlphaFold, with scientists and released predicted shapes for more than 350,000 proteins, including all proteins expressed by the human genome. It immediately shifted the course of biological research. If scientists can identify the shapes of proteins, they can more quickly understand diseases, create new medicines and otherwise probe the mysteries of life on Earth.
Now, DeepMind has released predictions for nearly every protein known to science. On Thursday, the London-based lab, owned by the same parent company as Google, said it had added more than 200 million predictions to an online database freely available to scientists across the globe.
With this new release, the scientists behind DeepMind hope to speed up research into more obscure organisms and spark a new field called metaproteomics.
“Scientists can now explore this entire database and look for patterns — correlations between species and evolutionary patterns that might not have been evident until now,” Demis Hassabis, the chief executive of DeepMind, said in a phone interview.
Proteins begin as strings of chemical compounds, known as amino acids, then twist and fold into three-dimensional shapes that define how these molecules bind to others. If scientists can pinpoint the shape of a particular protein, they can decipher how it operates.
This knowledge is often a vital part of the fight against illness and disease. For instance, bacteria resist antibiotics by expressing certain proteins. If scientists can understand how these proteins operate, they can begin to counter antibiotic resistance.
Previously, pinpointing the shape of a protein required extensive experimentation involving X-rays, microscopes and other tools on a lab bench. Now, given the string of chemical compounds that make up a protein, AlphaFold can predict its shape.
The technology is not perfect. But it can predict the shape of a protein with an accuracy that rivals physical experiments about 63 percent of the time, according to independent benchmark tests. With a prediction in hand, scientists can verify its accuracy relatively quickly.
Kliment Verba, a researcher at the University of California, San Francisco, who uses the technology to understand the coronavirus and to prepare for similar pandemics, said the technology had “supercharged” this work, often saving months of experimentation time. Others have used the tool as they struggle to fight gastroenteritis, malaria and Parkinson’s disease.
The technology has also accelerated research beyond the human body, including an effort to improve the health of honeybees. DeepMind’s expanded database can help an even larger community of scientists reap similar benefits.
Like Dr. Hassabis, Dr. Verba believes the database will provide new ways of understanding how proteins behave across species. He also sees it as a way of educating a new generation of scientists. Not all researchers are versed in this kind of structural biology; a database of all known proteins lowers the bar to entry. “It can bring structural biology to the masses,” Dr. Verba said.