MassIVE MSV000098624

Partial Public

Large-scale multi-omic analysis identifies noncoding somatic driver mutations and nominates ZFP36L2 as a driver gene for pancreatic ductal adenocarcinoma

Description

Background: The identification and characterization of somatic cancer driver mutations in the noncoding genome remains challenging.

Objective: To broadly characterize noncoding driver mutations for pancreatic ductal adenocarcinoma (PDAC).

Design: Using mutation calls from whole-genome sequence (WGS) data in PDACs and genome-scale maps of accessible gene regulatory regions in normal- and tumor-derived pancreatic samples, we analyzed enrichment of noncoding mutations in gene regulatory regions relevant to normal- and tumor-derived pancreatic contexts. Functional follow-up of potential driver mutations was performed using chromatin interaction analyses, massively parallel reporter assays (MPRA), and targeted analysis of selected noncoding somatic mutations.

Results: We first created genome-scale maps of accessible chromatin regions (ACRs) and histone modification marks (HMMs) in pancreatic cell lines and purified pancreatic acinar and duct cells. Integration with whole-genome mutation calls from 506 PDACs revealed 314 ACRs/HMMs significantly enriched with 3,614 noncoding somatic mutations (NCSMs). Chromatin interaction analysis identified 416 potential target genes, and MPRA revealed 178 NCSMs impacting reporter activity (19.45% of those tested). Targeted luciferase validation confirmed negative effects on gene regulatory activity for NCSMs near ZFP36L2 and CDKN2A. For the former, CRISPR interference (CRISPRi) identified ZFP36L2 as a target gene (16.0 to 24.0% reduced expression, P = 0.023 to 0.0047), and growth inhibition after overexpression of ZFP36L2 (4.1- to 14.1-fold reduction, P = 6.0 × 10^-4 to 3.2 × 10^-3) implicates a possible tumor suppressor function.

Conclusion: Our integrative approach provides a catalog of potential noncoding driver mutations and nominates ZFP36L2 as a novel PDAC driver gene with a likely tumor suppressor function.

[doi:10.25345/C5HH6CJ9P]
[dataset license: CC0 1.0 Universal (CC0 1.0)]

Keywords: Mutations, Pancreatic Cancer, Gene Regulation, Chromatin, Epigenetics; DatasetType: Proteomics

Contact

Principal Investigators:
(in alphabetical order)
Laufey Amundadottir, NCI/NIH, United States of America
Submitting User: laufey_a