MassIVE MSV000080641

Imported Reanalysis Dataset Public PXD004010

JUMPg: an Integrative Proteogenomics Pipeline Identifying Unannotated Proteins in Human Brain and Cancer Cells

Description

Proteogenomics is an emerging approach to improve gene annotation and the interpretation of proteomics data. Here we present JUMPg, an integrative proteogenomics pipeline that includes customized database construction, tag-based database search, peptide-spectrum match filtering, and data visualization. JUMPg creates multiple databases covering DNA polymorphisms, mutations, splice junctions, and partial trypticity, as well as protein fragments translated from the whole transcriptome in all six frames after RNA-seq de novo assembly. We use a multistage strategy to search these databases sequentially, in which performance is optimized by re-searching only unmatched high-quality spectra and by re-using amino acid tags generated by the JUMP search engine. The identified peptides/proteins are displayed with gene loci using the UCSC genome browser. JUMPg is applied to a label-free mass spectrometry dataset of Alzheimer’s disease postmortem brain, uncovering 496 new peptides arising from amino acid substitutions, alternative splicing, frameshifts, and “non-coding gene” translation. The novel protein PNMA6BL, specifically expressed in the brain, is highlighted. We also tested JUMPg on a stable-isotope labeled dataset of multiple myeloma cells, revealing 991 sample-specific peptides that include protein sequences in the immunoglobulin light chain variable region. Thus, the JUMPg program is an effective proteogenomics tool for multi-omics data integration. [dataset license: CC0 1.0 Universal (CC0 1.0)]
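
The multistage strategy described above can be illustrated with a minimal Python sketch. It is not JUMPg's implementation: the `Spectrum` class, the `search_database` and `multistage_search` functions, the substring-based tag match, and the `min_quality` threshold are all hypothetical names and simplifications introduced here for illustration. The sketch only shows the control flow the description names: databases are searched one after another, amino acid tags are computed once per spectrum and reused at every stage, and later stages see only spectra that are still unmatched and of high quality.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spectrum:
    """Hypothetical container; tags are derived once and reused in every stage."""
    scan_id: str
    tags: tuple      # short amino acid tags inferred from the spectrum
    quality: float   # assumed spectrum quality score in [0, 1]


def search_database(spectrum, peptides):
    """Toy tag-based search: return the first peptide containing any tag.

    A real search engine would also score the full peptide-spectrum match.
    """
    for peptide in peptides:
        if any(tag in peptide for tag in spectrum.tags):
            return peptide
    return None


def multistage_search(spectra, databases, min_quality=0.5):
    """Search databases in order, carrying forward only unmatched spectra.

    From the second stage on, spectra below min_quality are dropped
    (an assumed stand-in for the pipeline's spectrum quality control).
    """
    matches = {}
    remaining = list(spectra)
    for stage, (db_name, peptides) in enumerate(databases):
        if stage > 0:
            remaining = [s for s in remaining if s.quality >= min_quality]
        unmatched = []
        for spectrum in remaining:
            hit = search_database(spectrum, peptides)
            if hit is None:
                unmatched.append(spectrum)
            else:
                matches[spectrum.scan_id] = (db_name, hit)
        remaining = unmatched
    return matches, remaining
```

Calling `multistage_search` with a list of `Spectrum` objects and an ordered list of `(name, peptides)` pairs, for example a reference database followed by a variant database, returns the matches keyed by scan and the spectra that stayed unmatched after the last stage.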

Keywords: Genomics ; proteomics ; mass spectrometry ; proteogenomics ; RNA-seq ; database search ; multistage analysis ; spectrum quality control

Contact

Principal Investigators: (in alphabetical order)	Junmin Peng, St. Jude Children's Research Hospital, N/A
Submitting User:	ccms

Publications

Li Y, Wang X, Cho JH, Shaw TI, Wu Z, Bai B, Wang H, Zhou S, Beach TG, Wu G, Zhang J, Peng J.
JUMPg: An Integrative Proteogenomics Pipeline Identifying Unannotated Proteins in Human Brain and Cancer Cells.
J. Proteome Res. 2016 Jul 1;15(7):2309-20. Epub 2016 Jun 13.

