MassIVE MSV000084172

Partial Public

Mono-allelic datasets for: A large peptidome dataset improves HLA class I epitope prediction across most of the human population

Comment Reanalyze Spectra Add Reanalysis

Description

Sarkizova S, Klaeger S, Le PM, Li LW, Oliveira G, Keshishian H, Hartigan CH, Zhang W, Braun DA, Ligon KL, Bachireddy P, Zervantonakis IK, Rosenbluth JM, Ouspenskaia T, Law T, Justeson S, Stevens J, Lane WJ, Eisenhaure T, Zhang GL, Clauser KR, Hacohen N, Carr SA, Wu CJ, Keskin DB. Nature Biotechnology 2019. Prediction of HLA epitopes is important for the development of cancer immunotherapies and vaccines. However, current prediction algorithms have limited predictive power, in part because they were not trained on high-quality epitope datasets covering a broad range of HLA alleles. To enable prediction of endogenous HLA class I-associated peptides across a large fraction of the human population, we used mass spectrometry to profile >185,000 peptides eluted from 95 HLA-A, -B, -C and -G mono-allelic cell lines. We identified canonical peptide motifs per HLA allele, unique and shared binding submotifs across alleles and distinct motifs associated with different peptide lengths. By integrating these data with transcript abundance and peptide processing, we developed HLAthena, providing allele-and-length-specific and pan-allele-pan-length prediction models for endogenous peptide presentation. These models predicted endogenous HLA class I-associated ligands with 1.5-fold improvement in positive predictive value compared with existing tools and correctly identified >75% of HLA-bound peptides that were observed experimentally in 11 patient-derived tumor cell lines. [doi:10.25345/C5N36Q] [dataset license: CC0 1.0 Universal (CC0 1.0)]

Keywords: HLA class I

Contact

Principal Investigators: (in alphabetical order)	Steven A. Carr, Broad Institute of MIT and Harvard, United States
Submitting User:	clauser

Number of Files:
Total Size:
Spectra:
Subscribers:

	Owner	Reanalyses
Experimental Design
Conditions:
Biological Replicates:
Technical Replicates:

Identification Results
Proteins (Human, Remapped):
Proteins (Reported):
Peptides:
Variant Peptides:
PSMs:

Quantification Results
Differential Proteins:
Quantified Proteins:

Browse Dataset Files

FTP Download Link (click to copy):

Species

Homo sapiens

Instrument

Modifications

- Dataset Reanalyses

+ Dataset History

Number of distinct proteins found to be differentially abundant in at least one comparison across all analyses (original submission and reanalyses) associated with this dataset.

A protein is differentially abundant if its change in abundance across conditions is found to be statistically significant with an adjusted p-value <= 0.05 and lists no issues associated with statistical tests for differential abundance.

Distinct protein accessions are counted across all files submitted in the "Statistical Analysis of Quantified Analytes" category having a "Protein" column in this dataset.

"N/A" means no results of this type were submitted.