Proteins are the building blocks of life as they carry out most cellular functions. They are made of an extended sequence of amino acids (primary structure) and then fold into a compact shape (tertiary structure) that determines their function. The protein structure prediction problem is to efficiently predict a protein’s 3D structure from its primary structure. The 3D structure is key to understanding the mechanisms of life, to finding new drugs to combat disease, and to designing new proteins with desired functions not currently found in nature. Diseases associated with proteins not working properly include:
Mad cow and
In recent years, a vast amount of protein sequence information has become available from an explosion of genome projects and high-throughput sequencing techniques. However, sequence alone is not enough; researchers need a protein’s 3D structure to determine its function. Current experimental approaches are time-consuming and expensive, which creates a need for fast and accurate computational methods to complement the experimental ones.
Unfortunately, the protein structure prediction problem is insanely hard!
The conformational space of a protein (the set of 3D conformations a primary sequence can fold into) grows exponentially with the number of amino acids in the sequence. Thus, what takes nature microseconds may take supercomputers years.
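To get a feel for that exponential growth, here is a rough back-of-the-envelope sketch. Every number in it is an assumption chosen for illustration only: we pretend each amino acid can take just 3 local conformations, and that a (hypothetical) computer can examine 10^15 conformations per second. Real proteins and real prediction methods are far more subtle than this, and no method actually searches exhaustively.

```python
# Back-of-the-envelope illustration of exponential conformational growth.
# All constants are ASSUMPTIONS for illustration, not measured values.

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def conformation_count(num_residues: int, states_per_residue: int = 3) -> int:
    """Number of conformations if every residue independently picks one of
    `states_per_residue` local shapes (assumed value: 3)."""
    return states_per_residue ** num_residues


def exhaustive_search_years(
    num_residues: int,
    states_per_residue: int = 3,
    samples_per_second: float = 1e15,  # assumed, hypothetical machine speed
) -> float:
    """Years needed to enumerate every conformation at the assumed rate."""
    total = conformation_count(num_residues, states_per_residue)
    return total / samples_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    for n in (10, 50, 100, 150):
        years = exhaustive_search_years(n)
        print(f"{n:>3} residues: {conformation_count(n):.3e} conformations, "
              f"about {years:.3e} years to enumerate")
```

Even under these generous assumptions, going from 50 to 100 amino acids multiplies the search time by 3^50, which is why brute-force enumeration is hopeless and smarter sampling and scoring ideas are needed.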
To assess and advance the field, Moult and his colleagues created an experiment called the Critical Assessment of techniques for protein Structure Prediction (CASP). Launched in 1994 and held every other summer, CASP has challenged scientists worldwide to submit blind predictions of the 3D structures of proteins, called targets, using only their primary structure. The 3D structures of these proteins have been determined experimentally, but the results are not published until the end of the experiment. Thus, the submitted computational models are blindly predicted, and the results of the assessment performed by CASP-appointed experts are only known after the experiment is over.
CASP has documented substantial improvements in the field. However, no single group has yet been able to consistently predict the structure of ‘hard’ proteins with even moderate accuracy. Moreover, recent progress reports by both CASP assessors and organizers show only incremental improvements in this category since CASP6 (held in 2004).
To catalyze larger and more rapid advances in the field, we started WeFold, an open online collaborative effort mediated by the science gateway http://www.wefold.org. Initiated in 2012, just in time for CASP10, the project brought together 31 scientists from 13 labs around the world using a wide range of methods. One of these labs, Foldit, uses a computer game to create protein models and has recruited more than 300,000 citizen scientists to the effort. Check it out!
We’re doing WeFold again this year in the context of CASP11, and we have designed this new gateway so that students and researchers who are not directly associated with a CASP team can discuss protein structure prediction and propose new ideas.
We need fresh, exciting ideas and that’s why we want you to join in!
We can test those ideas on previous CASP targets, for which we know the experimental structure and can compare results. Once the ideas become mature enough, we can try them in CASP11 or beyond.