In every cell, thousands of different proteins make up the machinery that keeps all living things – from humans and plants to microscopic bacteria – alive and healthy. Almost all diseases, including cancer, dementia and even infectious diseases such as COVID-19, are related to how these proteins work. Because the function of any protein is directly related to its three-dimensional shape, scientists around the world have spent half a century trying to find an accurate and quick way to determine the shape of a protein.
Today (Monday), researchers from the 14th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP14) will announce that an artificial intelligence (AI) solution to this challenge has been found.
Building on the work of hundreds of researchers around the world, an AI program called AlphaFold, developed by the London-based AI laboratory DeepMind, has proven capable of determining the shape of many proteins. It has done so with an accuracy comparable to that of expensive and time-consuming laboratory experiments.
CASP14 is organized by Dr. John Moult (Chairman), University of Maryland, USA; Dr. Krzysztof Fidelis, UC Davis, USA; Dr. Andriy Kryshtafovych, UC Davis, USA; Dr. Torsten Schwede, University of Basel and SIB Swiss Institute for Bioinformatics, Switzerland; and Dr. Maya Topf, Birkbeck, University of London, UK and CSSB (HPI and UKE) Hamburg, Germany.
Dr. Moult said, “Proteins are extremely complex molecules, and their precise three-dimensional structure is key to the many roles they play, such as the insulin that regulates sugar levels in our blood and the antibodies that help us fight infections. Even tiny rearrangements of these vital molecules can have catastrophic effects on our health. One of the most efficient ways to understand diseases and find new treatments is to study the proteins involved.
“There are tens of thousands of human proteins and many billions in other species, including bacteria and viruses, but working out the shape of just one requires expensive equipment and can take years.
“Almost 50 years ago, Christian Anfinsen was awarded a Nobel Prize for showing that it should be possible to determine the shape of proteins based on their sequence of amino acids – the individual building blocks that make up proteins. That is why our scientific community has been working on the biennial CASP challenge.”
Teams taking part in the CASP challenge are given the amino acid sequences for a set of approximately 100 proteins. While scientists study the proteins in the laboratory to determine their shape experimentally, around 100 participating CASP teams from more than 20 countries try to do the same with computers. The results are evaluated by independent scientists.
Dr. Fidelis said, “The CASP approach has created intense collaboration between researchers in this area of science and we have seen it accelerate scientific developments.
“Since we first ran the challenge in 1994, we have seen a series of discoveries, each solving one aspect of this problem, making computed models of protein structures increasingly useful in medical research.”
In the latest round of the challenge, DeepMind’s AlphaFold program determined the shape of approximately two-thirds of the proteins with an accuracy comparable to that of laboratory experiments*. AlphaFold’s accuracy with most of the other proteins was also high, though not quite at that level.
The CASP organizers say this success builds on achievements made in previous CASP rounds by both the DeepMind team and other participants, and that other teams participating in CASP14 also produced some highly accurate structures in this round.
Dr. Kryshtafovych said, “What AlphaFold has achieved is truly remarkable and today’s announcement is a win for DeepMind, but it’s also a triumph for team science. The unique and intense way in which we work with researchers around the world through CASP and the contributions of many teams of scientists over the years have led us to this breakthrough.”
He adds, “The ability to determine the shape of proteins quickly and accurately can revolutionize the life sciences. With the problem largely resolved for individual proteins, new methods can be developed for determining the shape of protein complexes – collections of proteins that work together to make up much of the machinery of life – and for other applications.”
Professor Dame Janet Thornton, Director Emeritus of EMBL’s European Bioinformatics Institute (EMBL-EBI), who is not affiliated with CASP or DeepMind, said, “One of the greatest mysteries in biology is how proteins fold to create extraordinarily unique three-dimensional structures. Every living being – from the smallest bacteria to plants, animals and humans – is defined and powered by the proteins that support it at the molecular level.
“So far, this mystery has remained unsolved, and determining a single protein structure has often required years of experimental effort. It is tremendous to see the triumph of human curiosity, aspiration, and intelligence in solving this problem. A better understanding of protein structures and the ability to predict them with the help of a computer means a better understanding of life, evolution, and of course human health and disease.”
* AlphaFold produced models for approximately two-thirds of the CASP14 target proteins with Global Distance Test scores above 90 out of 100. Above the 90-point threshold, the remaining differences between the models and the experimental structures are small and of the size expected from experimental artifacts and errors, as well as alternative low-energy local conformations. Note that these CASP targets are individual proteins or domains, not protein complexes, which are a next frontier. The Global Distance Test is a measure of how closely the shape of the protein model matches the shape determined by laboratory experiments (Zemla A, Venclovas Č, Moult J, Fidelis K. Processing and evaluation of predictions in CASP4. Proteins 2001; Suppl 5: 13-21; Zemla A. LGA: a method for finding 3D similarities in protein structures. Nucleic Acids Res 2003; 31 (13): 3370-3374).
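For readers who want a feel for how a score of this kind is computed, here is a minimal Python sketch of a simplified Global Distance Test calculation. It is an illustration under stated assumptions, not the procedure used by the CASP assessors: the function name gdt_ts and the toy coordinates are hypothetical, the distance cutoffs of 1, 2, 4 and 8 ångströms are the ones commonly used for this score and are not stated in this release, and the model is assumed to be already superimposed on the experimental structure (the full method described in the cited papers searches over many superpositions).

```python
import math


def gdt_ts(model, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified Global Distance Test score on a 0-100 scale.

    model, reference: equal-length lists of (x, y, z) positions, one per
    residue, in angstroms. ASSUMPTION: the two structures are already
    superimposed; no superposition search is performed here.
    cutoffs: distance thresholds in angstroms (ASSUMPTION: 1, 2, 4, 8).
    """
    if len(model) != len(reference) or not model:
        raise ValueError("need two non-empty coordinate lists of equal length")

    # Distance between each modelled residue and its experimental position.
    distances = [math.dist(m, r) for m, r in zip(model, reference)]

    # Fraction of residues that fall within each cutoff.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]

    # Average the fractions over all cutoffs and scale to 0-100.
    return 100.0 * sum(fractions) / len(cutoffs)


# Toy example: four residues displaced by 0.5, 1.5, 3.0 and 10.0 angstroms.
reference = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
model = [(0.5, 0.0, 0.0), (5.5, 0.0, 0.0), (11.0, 0.0, 0.0), (22.0, 0.0, 0.0)]
score = gdt_ts(model, reference)
print(score)  # 56.25
```

In this toy case, one residue lies within 1 Å, two within 2 Å, and three within both 4 Å and 8 Å, so the score is the average of 25, 50, 75 and 75, which is 56.25. A model identical to the experimental structure would score 100.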
Meeting: 14th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP14).
Funding: CASP operations are supported in part by a grant from the National Institutes of Health, NIH R01GM100482.