Superfamily assignments for the yeast proteome through integration of structure prediction with the gene ontology

PLoS Biol. 2007 Apr;5(4):e76. doi: 10.1371/journal.pbio.0050076.

Abstract

Saccharomyces cerevisiae is one of the best-studied model organisms, yet the three-dimensional structure and molecular function of many yeast proteins remain unknown. Yeast proteins were parsed into 14,934 domains, and those lacking sequence similarity to proteins of known structure were folded using the Rosetta de novo structure prediction method on the World Community Grid. This structural data was integrated with process, component, and function annotations from the Saccharomyces Genome Database to assign yeast protein domains to SCOP superfamilies using a simple Bayesian approach. We have predicted the structure of 3,338 putative domains and assigned SCOP superfamily annotations to 581 of them. We have also assigned structural annotations to 7,094 predicted domains based on fold recognition and homology modeling methods. The domain predictions and structural information are available in an online database at http://rd.plos.org/10.1371_journal.pbio.0050076_01.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Bayes Theorem
  • Genes, Fungal*
  • Protein Conformation
  • Proteome*
  • Saccharomyces cerevisiae / genetics
  • Saccharomyces cerevisiae / metabolism*
  • Saccharomyces cerevisiae Proteins / chemistry
  • Saccharomyces cerevisiae Proteins / genetics
  • Saccharomyces cerevisiae Proteins / metabolism*

Substances

  • Proteome
  • Saccharomyces cerevisiae Proteins