MassIVE MSV000080712

Imported Reanalysis Dataset Public PXD002613

Exome-based proteogenomics of HEK-293 human cell line: coding genomic variants identified at the level of shotgun proteome

Description

Genomic and proteomic data were integrated into a proteogenomic workflow to identify coding genomic variants of the standard Human Embryonic Kidney 293 (HEK-293) cell line at the proteome level. Shotgun proteome data for HEK-293, both published by Geiger et al. (2012) and obtained in this work, were searched against a customized genomic database generated from exome data published by Lin et al. (2014). Of the ~1,200 coding variants annotated in the exome, 54 unique variants were found at the proteome level. Of these, 27 were validated by two search engines, X!Tandem and Andromeda, and 16 (60%) of the validated variants were confidently identified in both our own and the published proteome datasets. Some of the variants found are widely known genomic polymorphisms originating from the germline, while others are more likely to result from somatic mutations. Notably, the peptide subsets identified by only one of the two search engines were enriched in sequences with missed cleavages. This may be due to a large proportion of false-positive hits in these subsets, which is especially true for the subset of variant peptides. High-resolution mass spectra of the HEK-293 cell line were deposited to the ProteomeXchange repository, project accession PXD002613. [dataset license: CC0 1.0 Universal (CC0 1.0)]
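
The proportions behind the counts in the description can be checked with a few lines of arithmetic. This is only a worked illustration: the variable names are made up here, and the exome total is the approximate figure (~1,200) quoted above, so the first percentage is approximate too.

```python
# Counts quoted in the description; the exome total is approximate (~1,200).
coding_variants_in_exome = 1200
found_in_proteome = 54
validated_by_both_engines = 27
confirmed_in_both_datasets = 16


def share(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


print(f"found / exome:       {share(found_in_proteome, coding_variants_in_exome):.1f}%")
print(f"validated / found:   {share(validated_by_both_engines, found_in_proteome):.1f}%")
print(f"confirmed / validated: {share(confirmed_in_both_datasets, validated_by_both_engines):.1f}%")
```

The last ratio, 16 of 27, is about 59.3%, which the description rounds to 60%.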

Keywords: Proteogenomics ; HEK-293 cell line ; Exome ; Shotgun proteomics

Contact

Principal Investigators:
(in alphabetical order)
Sergei Moshkovskii, Laboratory of Medical Proteomics, Department of Personalized Medicine, Institute of Biomedical Chemistry, N/A
Submitting User: ccms

Publications

Lobas AA, Karpov DS, Kopylov AT, Solovyeva EM, Ivanov MV, Ilina IY, Lazarev VN, Kuznetsova KG, Ilgisonis EV, Zgoda VG, Gorshkov MV, Moshkovskii SA.
Exome-based proteogenomics of HEK-293 human cell line: Coding genomic variants identified at the level of shotgun proteome.
Proteomics. 2016 Jul;16(14):1980-91. Epub 2016 Jun 21.

The dataset's submitted spectrum files can be queued for conversion to open formats (e.g. mzML); this process may take some time. When complete, the converted files are available in the "ccms_peak" subdirectory of the dataset's FTP space.

Conditions: Number of distinct conditions across all analyses (original submission and reanalyses) associated with this dataset. Distinct condition labels are counted across all files submitted in the "Metadata" category having a "Condition" column in this dataset.

Biological Replicates: Number of distinct biological replicates across all analyses (original submission and reanalyses) associated with this dataset. Distinct replicate labels are counted across all files submitted in the "Metadata" category having a "BioReplicate" or "Replicate" column in this dataset.

Technical Replicates: Number of distinct technical replicates across all analyses (original submission and reanalyses) associated with this dataset. The technical replicate count is defined as the maximum number of times any one distinct combination of condition and biological replicate was analyzed across all files submitted in the "Metadata" category. In the case of fractionated experiments, only the first fraction is considered.

"N/A" means no results of this type were submitted.

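The experimental-design counts defined above can be sketched as follows. This is a minimal illustration, not MassIVE's actual implementation: the function name, the in-memory row format, the "Fraction" column name, and the choice of the smallest fraction value as "the first fraction" are all assumptions made for the example.

```python
from collections import Counter


def summarize_design(rows):
    """Count conditions, biological replicates, and technical replicates.

    rows: list of dicts, one per analyzed run, with hypothetical keys
    "Condition", "BioReplicate", and (optionally) "Fraction".
    """
    # Distinct labels, ignoring missing or empty values.
    conditions = {r.get("Condition") for r in rows if r.get("Condition")}
    bioreps = {r.get("BioReplicate") for r in rows if r.get("BioReplicate")}

    # Assumption: the first fraction is the smallest "Fraction" value;
    # runs without a "Fraction" value are treated as fraction 1.
    first_fraction = min((r.get("Fraction", 1) for r in rows), default=1)

    # Count how often each (condition, biological replicate) pair was run,
    # considering only first-fraction rows.
    runs = Counter(
        (r.get("Condition"), r.get("BioReplicate"))
        for r in rows
        if r.get("Fraction", 1) == first_fraction
    )
    technical = max(runs.values(), default=0)
    return len(conditions), len(bioreps), technical


example_rows = [
    {"Condition": "control", "BioReplicate": "1", "Fraction": 1},
    {"Condition": "control", "BioReplicate": "1", "Fraction": 1},
    {"Condition": "control", "BioReplicate": "1", "Fraction": 2},
    {"Condition": "treated", "BioReplicate": "2", "Fraction": 1},
]
print(summarize_design(example_rows))  # (conditions, bio replicates, technical replicates)
```

In the example, the second fraction of the control run is ignored, so the pair ("control", "1") counts as two technical replicates rather than three.
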
Proteins (Human, Remapped): Originally identified proteins that were automatically remapped by MassIVE to proteins in the SwissProt human reference database.

Proteins (Reported): Number of distinct protein accessions reported across all analyses (original submission and reanalyses) associated with this dataset.

Peptides: Number of distinct unmodified peptide sequences reported across all analyses (original submission and reanalyses) associated with this dataset.

Variant Peptides: Number of distinct peptide sequences (including modified variants or peptidoforms) reported across all analyses (original submission and reanalyses) associated with this dataset.

PSMs: Total number of peptide-spectrum matches (i.e. spectrum identifications) reported across all analyses (original submission and reanalyses) associated with this dataset.

"N/A" means no results of this type were submitted.

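The difference between the three identification counts above (PSMs, distinct peptide sequences including modified forms, and distinct unmodified sequences) can be illustrated with a short sketch. The input format and the modification notation (annotations in square brackets or parentheses) are assumptions for the example; real result files may encode modifications differently.

```python
import re

# Assumption: modifications are written inline as [...] or (...) annotations.
_MODIFICATION = re.compile(r"\[[^\]]*\]|\([^)]*\)")


def strip_modifications(peptidoform):
    """Return the bare amino-acid sequence of an annotated peptide string."""
    return _MODIFICATION.sub("", peptidoform)


def summarize_identifications(psms):
    """psms: list of (spectrum_id, peptidoform) pairs, one per spectrum match."""
    peptidoforms = {peptide for _, peptide in psms}
    sequences = {strip_modifications(p) for p in peptidoforms}
    return {
        "psms": len(psms),                  # every spectrum identification
        "peptidoforms": len(peptidoforms),  # distinct sequences incl. modified forms
        "peptides": len(sequences),         # distinct unmodified sequences
    }


example_psms = [
    ("scan1", "PEPTIDEK"),
    ("scan2", "PEPT[+79.966]IDEK"),
    ("scan3", "PEPTIDEK"),
    ("scan4", "LM(ox)NK"),
]
print(summarize_identifications(example_psms))
```

Here four PSMs collapse to three peptidoforms and two unmodified sequences, because the phosphorylated and plain forms of PEPTIDEK share one backbone sequence.
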
Quantified Proteins: Number of distinct proteins quantified across all analyses (original submission and reanalyses) associated with this dataset. Distinct protein accessions are counted across all files submitted in the "Statistical Analysis of Quantified Analytes" category having a "Protein" column in this dataset.

Differential Proteins: Number of distinct proteins found to be differentially abundant in at least one comparison across all analyses (original submission and reanalyses) associated with this dataset. A protein is differentially abundant if its change in abundance across conditions is found to be statistically significant with an adjusted p-value <= 0.05 and lists no issues associated with statistical tests for differential abundance. Distinct protein accessions are counted across all files submitted in the "Statistical Analysis of Quantified Analytes" category having a "Protein" column in this dataset.

"N/A" means no results of this type were submitted.

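The quantification counts above amount to a filter over per-comparison rows. The sketch below shows one way to apply the stated rule (adjusted p-value <= 0.05 and no reported issue); the function name and the "adj_pvalue" and "issue" keys are hypothetical, and only the "Protein" column name comes from the text.

```python
def summarize_quantification(rows, alpha=0.05):
    """Count quantified and differentially abundant proteins.

    rows: list of dicts, one per protein and comparison, with a "Protein"
    key plus hypothetical "adj_pvalue" and "issue" keys.
    """
    quantified = {r["Protein"] for r in rows}
    differential = {
        r["Protein"]
        for r in rows
        if r.get("adj_pvalue") is not None   # a test result must exist
        and r["adj_pvalue"] <= alpha         # significant after adjustment
        and not r.get("issue")               # no issue flagged for the test
    }
    return len(quantified), len(differential)


example_results = [
    {"Protein": "P1", "adj_pvalue": 0.01, "issue": None},
    {"Protein": "P1", "adj_pvalue": 0.50, "issue": None},
    {"Protein": "P2", "adj_pvalue": 0.03, "issue": "one condition missing"},
    {"Protein": "P3", "adj_pvalue": 0.05, "issue": None},
    {"Protein": "P4", "adj_pvalue": None, "issue": None},
]
print(summarize_quantification(example_results))  # (quantified, differential)
```

A protein counts as differential if any one of its comparisons passes the filter, which is why P1 is included despite its second, non-significant comparison.
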
This dataset may not contain all raw spectra data as originally deposited in PRIDE. It has been imported to MassIVE for reanalysis purposes, so its spectra data here may consist solely of processed peak lists suitable for reanalysis with most software.