The wfRstta-PQ2-Seder Branch

Posted in WeFold3

E. Faraggi and A. Kloczkowski

We have participated in the Critical Assessment of Protein Structure
Prediction (CASP) experiment with four prediction procedures. The
procedure described in this abstract is labeled as group "wfRstta-PQ2-Seder",
number 067. This method is based on a new version of the Seder program
[1,2], with new and improved input features that will be described in an
upcoming manuscript. For this procedure Seder was trained with soft and
hard protein targets. We used CASP5 through CASP10 server models as
training data and CASP11 server models as a test set. That is, in this
case we trained over all CASP targets and optimized the prediction on both
hard and soft CASP11 targets. This version of Seder was then used to pick
among all CASP12 submitted server models and among WeFold [3]
chain models where available. To estimate the B-factors for
the protein models we used the following equation: B-factor = 300 *
SPXASA / (1 + model-residue-depth), where SPXASA is the SPINE-X [4]
predicted accessible surface area and model-residue-depth is the
residue depth reported by the program DEPTH [5]. This equation came from
approximately fitting a distribution of experimental B-factors.
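
As an illustration only, the B-factor equation above can be sketched in a few lines of Python. The function names, the per-residue list layout, and the sample values are assumptions made for this example and are not taken from Seder, SPINE-X, or DEPTH. The abstract does not state the units of SPXASA or of the residue depth, so the inputs are treated here as plain numbers.

```python
from typing import List, Sequence


def estimate_b_factor(spx_asa: float, residue_depth: float,
                      scale: float = 300.0) -> float:
    """Return scale * SPXASA / (1 + residue depth) for one residue.

    spx_asa:       predicted accessible surface area (units assumed, not
                   stated in the abstract)
    residue_depth: residue depth of the same residue in the model
    scale:         the constant 300 from the equation in the text
    """
    if residue_depth < 0:
        # A negative depth would be an input error in this sketch.
        raise ValueError("residue depth must be non-negative")
    return scale * spx_asa / (1.0 + residue_depth)


def estimate_b_factors(asa_values: Sequence[float],
                       depths: Sequence[float]) -> List[float]:
    """Apply the equation residue by residue to two aligned sequences."""
    if len(asa_values) != len(depths):
        raise ValueError("need one ASA value and one depth per residue")
    return [estimate_b_factor(a, d) for a, d in zip(asa_values, depths)]


if __name__ == "__main__":
    # Hypothetical values for two residues, for illustration only.
    print(estimate_b_factors([0.5, 0.25], [4.0, 9.0]))
```

In this form an exposed residue (large SPXASA, small depth) receives a larger estimated B-factor than a buried one, which is the qualitative behavior the equation encodes.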

1. Faraggi, Eshel, and Andrzej Kloczkowski. "A global machine learning
based scoring function for protein structure prediction." Proteins:
Structure, Function, and Bioinformatics 82.5 (2014): 752-759.

2. Manuscript in preparation.

3. Khoury, George A., et al. "WeFold: a coopetition for protein
structure prediction." Proteins: Structure, Function, and Bioinformatics
82.9 (2014): 1850-1868.

4. Faraggi, Eshel, et al. "SPINE X: improving protein secondary
structure prediction by multistep learning coupled with prediction of
solvent accessible surface area and backbone torsion angles." Journal of
Computational Chemistry 33.3 (2012): 259-267.

5. Tan, Kuan Pern, Raghavan Varadarajan, and Mallur S. Madhusudhan.
"DEPTH: a web server to compute depth and predict small-molecule binding
cavities in proteins." Nucleic Acids Research 39.suppl 2 (2011): W242-W248.