MassIVE MSV000078530

Partial Public

Upload for 2012 MCP manuscript - "Shotgun protein sequencing with meta-contig assembly."


Description

Shotgun protein sequencing with meta-contig assembly. Full-length de novo sequencing from tandem mass (MS/MS) spectra of unknown proteins, such as antibodies or proteins from organisms with unsequenced genomes, remains a challenging open problem. Conventional algorithms designed to sequence each MS/MS spectrum individually are limited by incomplete peptide fragmentation or low signal-to-noise ratios and tend to produce short de novo sequences with low sequencing accuracy. Our shotgun protein sequencing (SPS) approach was developed to ameliorate these limitations by first finding groups of unidentified spectra from overlapping peptides (contigs) and then deriving a consensus de novo sequence for each assembled set of spectra (contig sequences). Although SPS enables much more accurate reconstruction of de novo sequences longer than can be recovered from individual MS/MS spectra, it still requires error-tolerant matching to homologous proteins to group smaller contig sequences into full-length protein sequences, which limits its effectiveness on sequences from poorly annotated proteins. Using low- and high-resolution CID and high-resolution HCD MS/MS spectra, we address this limitation with a Meta-SPS algorithm designed to overlap and further assemble SPS contigs into Meta-SPS de novo contig sequences extending as long as 100 amino acids at over 97% accuracy without requiring any knowledge of homologous protein sequences. We demonstrate Meta-SPS using distinct MS/MS data sets obtained with separate enzymatic digestions and discuss how the remaining de novo sequencing limitations relate to MS/MS acquisition settings. [dataset license: CC0 1.0 Universal (CC0 1.0)]
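The idea of overlapping contigs and assembling them further can be illustrated with a deliberately simplified sketch. It is not the Meta-SPS algorithm: the description above says Meta-SPS works on contigs built from MS/MS spectra, whereas this toy version merges plain amino-acid strings by exact suffix/prefix overlap. The function names, the `min_overlap` parameter, and the input strings are all assumptions made for illustration.

```python
def overlap_len(left: str, right: str, min_overlap: int) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 if no overlap of at least `min_overlap` residues exists.
    """
    max_len = min(len(left), len(right))
    for k in range(max_len, max(min_overlap, 1) - 1, -1):
        if left[-k:] == right[:k]:
            return k
    return 0


def merge_contigs(contigs: list[str], min_overlap: int = 3) -> list[str]:
    """Greedily merge the pair with the longest exact overlap until none is left.

    Toy string analogue only: exact matching, no error tolerance, no masses.
    """
    seqs = list(dict.fromkeys(contigs))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(seqs):
            for j, b in enumerate(seqs):
                if i == j:
                    continue
                k = overlap_len(a, b, min_overlap)
                if k > best_k:
                    best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return seqs  # nothing left to merge
        merged = seqs[best_i] + seqs[best_j][best_k:]
        seqs = [s for idx, s in enumerate(seqs) if idx not in (best_i, best_j)]
        seqs.append(merged)


if __name__ == "__main__":
    # Hypothetical short contig sequences used only as toy input.
    toy = ["DIQMTQSPSS", "SPSSLSASVG", "SVGDRVTIT"]
    print(merge_contigs(toy))
```

In this sketch, three short sequences that overlap end to end collapse into one longer sequence, while sequences with no shared suffix/prefix of at least `min_overlap` residues are returned unchanged.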

Keywords: N/A

Contact

Principal Investigators: (in alphabetical order)	Dr. Nuno Bandeira
Submitting User:	aguthals



Species

Instrument

LTQ Orbitrap

Modifications

