
MS Interpretation Guidelines

Guidelines for interpreting mass spectrometry (MS) data:

What do the column headings mean?
The "sequence coverage" represents the percentage of the protein's sequence represented by the peptides identified in the MS run. The "sequence count" is the number of peptides that were used in the identification of the protein. The "spectrum count" is the total number of spectra found to correspond to the peptides used in the identification of the protein. One peptide may be represented by several spectra, so the sequence count and the spectrum count often do not match.
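As a rough illustration of how these three columns relate, the sketch below computes them for a single protein. The protein sequence, the peptide list, and the function name are hypothetical, and each peptide is simply matched against the protein by exact substring search, which is an assumption of this example rather than a description of how the YRC pipeline works.

```python
def sequence_coverage(protein: str, peptides: list[str]) -> float:
    """Percentage of residues in `protein` covered by at least one peptide."""
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for pep in set(peptides):
        # Mark every occurrence of the peptide in the protein sequence.
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Hypothetical data: one peptide sequence per matched spectrum.
protein = "MKTAYIAKQRQISFVKSHFSRQ"
spectra = ["MKTAYIAK", "MKTAYIAK", "QISFVK", "MKTAYIAK"]

sequence_count = len(set(spectra))  # distinct peptides identified
spectrum_count = len(spectra)       # every spectrum counts, repeats included
coverage = sequence_coverage(protein, spectra)

print(f"coverage={coverage:.1f}%  sequences={sequence_count}  spectra={spectrum_count}")
```

Here one peptide is observed in three spectra, so the spectrum count (4) exceeds the sequence count (2).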

Sorting the list of proteins
The proteins identified in a mass spectrometry (MS) run are listed from highest to lowest sequence coverage by default. You may change the criterion for sorting by clicking on the column header.

Confidence of the identification
Many factors contribute to the confidence of the identification. Proteins with a high sequence coverage can be regarded with a higher degree of confidence, whereas proteins with a low sequence coverage should be regarded with greater scrutiny unless they are validated in other MS runs or in other experiments (such as yeast two-hybrid). A protein with a higher sequence or spectrum count can also be regarded with higher confidence. All proteins listed are identified by at least two peptides (sequence count of 2 or greater), which we consider a minimum criterion for identification.
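The two-peptide minimum and the default ordering by sequence coverage can be sketched as a simple filter and sort. The `ProteinHit` record, its field names, and the example values are hypothetical and chosen only for illustration. They do not reflect the YRC database schema.

```python
from dataclasses import dataclass


@dataclass
class ProteinHit:
    name: str
    sequence_coverage: float  # percent of the protein sequence covered
    sequence_count: int       # number of peptides supporting the hit
    spectrum_count: int       # number of spectra supporting the hit


def reportable(hits, min_peptides=2):
    """Keep hits with at least `min_peptides` peptides, highest coverage first."""
    kept = [h for h in hits if h.sequence_count >= min_peptides]
    return sorted(kept, key=lambda h: h.sequence_coverage, reverse=True)


# Hypothetical run with three candidate proteins.
hits = [
    ProteinHit("ProtA", 12.5, 1, 3),
    ProteinHit("ProtB", 45.0, 6, 14),
    ProteinHit("ProtC", 78.2, 11, 40),
]
ranked = reportable(hits)
for h in ranked:
    print(h.name, h.sequence_coverage, h.sequence_count, h.spectrum_count)
```

ProtA is dropped because it is supported by a single peptide, even though it has three spectra.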

Another consideration is how well the mass spectrum matches the expected spectrum for a given peptide. To view this, click on the "Peptides" link next to the protein listed in the MS results. The columns marked XCorr and DeltaCn give the scores for the match between the tandem mass spectrum and the peptide sequence. The values under XCorr are obtained from the cross-correlation analysis. The larger the value, the closer the fit between the experimental tandem mass spectrum and the model tandem mass spectrum constructed from the sequence. Any given spectrum will match several sequences; the one shown is the best match.

The DeltaCn value is the difference between the XCorr values for the best match and the second-best match, normalized by the XCorr of the best match. The larger the DeltaCn, the better the match with the first sequence as compared to the second sequence. A low DeltaCn suggests that the first match is less reliable.
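The relationship between XCorr and DeltaCn can be sketched as follows. The function name, the `normalized` flag, and the example scores are assumptions of this sketch. SEQUEST-style output commonly reports the difference divided by the best XCorr, so that form is the default here, and the plain difference is available by passing `normalized=False`.

```python
def delta_cn(xcorrs, normalized=True):
    """Gap between the best and second-best XCorr for one spectrum.

    `xcorrs` holds the XCorr of every candidate sequence for the spectrum.
    With normalized=True the gap is divided by the best XCorr (assumed
    SEQUEST-style convention); otherwise the raw difference is returned.
    """
    ranked = sorted(xcorrs, reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two candidate matches")
    best, second = ranked[0], ranked[1]
    gap = best - second
    if not normalized:
        return gap
    return gap / best if best > 0 else 0.0


# Hypothetical scores for three candidate sequences of one spectrum.
scores = [2.4, 3.2, 1.1]
print(round(delta_cn(scores), 3))                    # normalized gap
print(round(delta_cn(scores, normalized=False), 3))  # raw gap
```

A spectrum whose top two candidates score almost the same yields a DeltaCn near zero, which is the situation the text flags as a less reliable first match.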

If two proteins are found together in the same run, are they in a complex?
This is a difficult question to answer. There are two extremes. One extreme is a mass spec run with many proteins, each identified by only a few peptides. We would put little stock in such data and would improve our purification. The other extreme is a mass spec run with 5 to 10 proteins, each with 70-90% sequence coverage. We would consider this good evidence for a single complex of proteins. Even in the best case, we recommend purifying the complex several times and repeating the mass spec analysis.

No single attribute serves as the definitive guide to confidence. Interpreting MS results requires taking all of these factors into consideration, along with some knowledge of biochemistry. For more information, please visit our introduction to mass spectrometry.

Note that protein interactions predicted in runs labeled as unpublished should be regarded with less confidence than those in runs associated with a publication. Any questions regarding the unpublished runs, such as protocols used or help with interpretation, should be addressed to the appropriate investigator. We suggest registering with the YRC and selecting "Ask Question" from the main menu. This will ensure your question is delivered to the appropriate investigator. Please be as specific as possible in your correspondence.


YRC Informatics Platform - Version 3.0
Created and Maintained by: Michael Riffle