MassIVE MSV000086810

Complete Public

Integrated Proteogenomic Characterization across Major Histological Types of Pediatric Brain Cancer

Description

Francesca Petralia, Nicole Tignor, Boris Reva, et al. (2020) Cell 183, 1962-1985

We report a comprehensive proteogenomics analysis, including whole-genome sequencing, RNA sequencing, and proteomics and phosphoproteomics profiling, of 218 tumors across 7 histological types of childhood brain cancer: low-grade glioma (n = 93), ependymoma (32), high-grade glioma (25), medulloblastoma (22), ganglioglioma (18), craniopharyngioma (16), and atypical teratoid rhabdoid tumor (12). Proteomics data identify common biological themes that span histological boundaries, suggesting that treatments used for one histological type may be applied effectively to other tumors sharing similar proteomics features. Immune landscape characterization reveals diverse tumor microenvironments across and within diagnoses. Proteomics data further reveal functional effects of somatic mutations and copy number variations (CNVs) not evident in transcriptomics data. Kinase-substrate association and co-expression network analysis identify important biological mechanisms of tumorigenesis. This is the first large-scale proteogenomics analysis across traditional histological boundaries to uncover foundational pediatric brain tumor biology and inform rational treatment selection.

The Children's Brain Tumor Network (CBTN) and the Pacific Pediatric Neuro-Oncology Consortia (PNOC) are collaborative research consortia focused on identifying therapies for children with brain tumors (https://cbttc.org). The consortia have contributed the Pediatric Brain Tumor Atlas (PBTA) dataset, which comprises clinical data for a cohort of 991 brain tumor subjects, with associated whole-genome sequencing and RNA-seq data hosted by the Gabriella Miller Kids First Data Resource (kidsfirstdrc.org) as part of the Gabriella Miller Kids First Pediatric Research Program (Kids First). Kids First is a Pan NIH Common Fund program dedicated to the development of large-scale data resources to help researchers uncover new insights into the biology of childhood cancer and structural birth defects (https://commonfund.nih.gov/KidsFirst).

Mass spectrometry raw data generation and preliminary analyses were performed at the Thermo Fisher Scientific Center for Multiplexed Proteomics (TCMP), Harvard Medical School (HMS), under the direction of Prof. Steven Gygi. Samples were prepared using the streamlined (SL)-TMT protocol (Navarrete-Perea et al., 2018), and MS analysis was performed using the SPS-MS3 strategy (Ting et al., 2011) developed in the Gygi lab. Additional data analyses from the CPTAC Common Data Analysis Pipeline (Rudnick et al., 2016) and from the University of Michigan Proteomics and Integrative Bioinformatics Laboratory (https://github.com/Nesvilab) are also provided. All raw and processed genomic data, as well as pathology reports, radiology reports, MRIs, and histology slide images, are accessible via the Kids First DRC (Data Resource Center).

[doi:10.25345/C5F21M] [dataset license: Custom User License]

Keywords: CPTAC

Contact

Principal Investigators:
(in alphabetical order)
Steven P Gygi, Department of Cell Biology, Harvard Medical School, Boston, USA, N/A
Submitting User: ccms
 
Conditions: Number of distinct conditions across all analyses (original submission and reanalyses) associated with this dataset. Distinct condition labels are counted across all files submitted in the "Metadata" category having a "Condition" column in this dataset.

Biological Replicates: Number of distinct biological replicates across all analyses (original submission and reanalyses) associated with this dataset. Distinct replicate labels are counted across all files submitted in the "Metadata" category having a "BioReplicate" or "Replicate" column in this dataset.

Technical Replicates: Number of distinct technical replicates across all analyses (original submission and reanalyses) associated with this dataset. The technical replicate count is defined as the maximum number of times any one distinct combination of condition and biological replicate was analyzed across all files submitted in the "Metadata" category. In the case of fractionated experiments, only the first fraction is considered.

"N/A" means no results of this type were submitted.
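
The three experimental-design counts defined above can be illustrated with a minimal Python sketch. It assumes the metadata has already been parsed into one dictionary per submitted file; the "Fraction" column name, the rule that a missing fraction value means an unfractionated run, and the function name design_counts are hypothetical choices made for illustration and are not taken from MassIVE itself.

```python
from collections import Counter


def design_counts(rows):
    """Return (conditions, biological replicates, technical replicates).

    `rows` is a list of dicts, one per metadata row. Column names
    "Condition", "BioReplicate" and "Replicate" follow the definitions in
    the text; "Fraction" is an assumed column name for this sketch.
    """
    def bio(row):
        # Either replicate column may be present.
        return row.get("BioReplicate") or row.get("Replicate")

    conditions = {r["Condition"] for r in rows if r.get("Condition")}
    bio_reps = {bio(r) for r in rows if bio(r)}

    # Only the first fraction is considered; a missing value is treated
    # as an unfractionated run (assumption).
    first_fraction = [r for r in rows if str(r.get("Fraction") or 1) == "1"]

    # Count how often each (condition, biological replicate) pair was run.
    runs = Counter((r.get("Condition"), bio(r)) for r in first_fraction)
    technical = max(runs.values(), default=0)

    return len(conditions), len(bio_reps), technical
```

With two first-fraction runs of the same sample, the technical replicate count is 2 even though a second fraction of that sample is also listed, because later fractions are ignored.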
Proteins (Human, Remapped): Originally identified proteins that were automatically remapped by MassIVE to proteins in the SwissProt human reference database.

Proteins (Reported): Number of distinct protein accessions reported across all analyses (original submission and reanalyses) associated with this dataset.

Peptides: Number of distinct unmodified peptide sequences reported across all analyses (original submission and reanalyses) associated with this dataset.

Variant Peptides: Number of distinct peptide sequences (including modified variants or peptidoforms) reported across all analyses (original submission and reanalyses) associated with this dataset.

PSMs: Total number of peptide-spectrum matches (i.e. spectrum identifications) reported across all analyses (original submission and reanalyses) associated with this dataset.

"N/A" means no results of this type were submitted.
Quantified Proteins: Number of distinct proteins quantified across all analyses (original submission and reanalyses) associated with this dataset.

Differential Proteins: Number of distinct proteins found to be differentially abundant in at least one comparison across all analyses (original submission and reanalyses) associated with this dataset. A protein is differentially abundant if its change in abundance across conditions is found to be statistically significant with an adjusted p-value <= 0.05 and lists no issues associated with statistical tests for differential abundance.

For both counts, distinct protein accessions are counted across all files submitted in the "Statistical Analysis of Quantified Analytes" category having a "Protein" column in this dataset.

"N/A" means no results of this type were submitted.
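
The two quantification counts can be sketched in Python as follows. This is an illustration under stated assumptions, not MassIVE's implementation: only the "Protein" column is named in the text, while the "adj.pvalue" and "issue" column names, the treatment of "NA" and empty strings as missing, and the function name quant_counts are hypothetical.

```python
# Values treated as "missing" or "no issue reported" (assumption).
MISSING = (None, "", "NA")


def quant_counts(rows, alpha=0.05):
    """Return (quantified proteins, differential proteins).

    `rows` is a list of dicts, one per protein-by-comparison result.
    "Protein" follows the text; "adj.pvalue" and "issue" are assumed
    column names for this sketch.
    """
    quantified = set()
    differential = set()
    for row in rows:
        protein = row.get("Protein")
        if protein in MISSING:
            continue
        quantified.add(protein)

        p = row.get("adj.pvalue")
        if p in MISSING:
            continue  # no usable adjusted p-value for this comparison
        # Significant in at least one comparison, with no issue flagged.
        if float(p) <= alpha and row.get("issue") in MISSING:
            differential.add(protein)
    return len(quantified), len(differential)
```

A protein counts as differential if any single comparison passes both conditions, which matches the "in at least one comparison" wording; a significant p-value on a row that carries an issue flag does not count.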
This dataset may not contain all raw spectra data as originally deposited in PRIDE. It has been imported to MassIVE for reanalysis purposes, so its spectra data here may consist solely of processed peak lists suitable for reanalysis with most software.