MassIVE Reanalysis - RMSV000000359.1

Open modification searching of SARS-CoV-2-human protein interaction data reveals novel viral modification sites

Description

We performed a reanalysis of a SARS-CoV-2-human protein-protein interaction (PPI) map from MSV000085144/PXD018117. Open modification searching (OMS) was used to investigate the presence of post-translational modifications (PTMs) in the context of the SARS-CoV-2 virus-host PPI network. Based on a more than two-fold increase in identified spectra, our detected protein interactions show a high overlap with independent mass spectrometry-based SARS-CoV-2 studies and with virus-host interactions for alternative viruses, and they include previously unknown protein interactions. Additionally, we identified several novel modification sites on SARS-CoV-2 proteins that we investigated in relation to their interactions with host proteins. A detailed analysis of relevant modifications, including phosphorylation, ubiquitination, and S-nitrosylation, provides important hypotheses about the functional role of these modifications during viral infection by SARS-CoV-2.

In the original study, affinity purification was performed using 27 SARS-CoV-2 proteins that were individually tagged and expressed in triplicate in HEK-293T cells. Bead-bound proteins were denatured, reduced, carbamidomethylated, and enzymatically digested using trypsin, and each sample was injected via an Easy-nLC 1200 (Thermo Fisher Scientific) into a Q-Exactive Plus mass spectrometer (Thermo Fisher Scientific). The SARS-CoV-2 proteins that were included are: all mature nonstructural proteins (Nsps), except for Nsp3 and Nsp16; a mutated version of Nsp5 to disable its proteolytic activity (Nsp5_C145A); and all predicted SARS-CoV-2 open reading frames (Orfs), including the spike (S), membrane (M), nucleocapsid (N), and envelope (E) proteins.

For the reanalysis, the downloaded raw files were first converted to MGF files using ThermoRawFileParser (version 1.2.3). Next, OMS was performed using the ANN-SoLo spectral library search engine (version 0.2.4). A combined human-SARS-CoV-2 spectral library was used for searching. The MassIVE-KB library (version 2018/06/15) was used as the human spectral library. SARS-CoV-2 spectra were simulated by generating all possible tryptic peptide sequences from the SARS-CoV-2 protein sequences downloaded from UniProt (version 2020/03/05) using Pyteomics (version 4.3.2) and predicting the corresponding spectra using Prosit (version prosit_intensity_2020_hcd; collision energy 33 as determined by Prosit collision energy calibration). A simulated spectral library for the green fluorescent protein was generated in a similar fashion. A final spectral library was compiled by merging all spectra using SpectraST (version 5.0) and adding decoy spectra in a 1:1 ratio using the shuffle-and-reposition method.

ANN-SoLo was configured to use a 20 ppm precursor mass tolerance during the first step of its cascade search and a 500 Da precursor mass tolerance during its open search. Other search settings were: removal of peaks below 101 m/z, above 1500 m/z, and in a 0.5 m/z window around the precursor mass; a 0.02 m/z fragment mass tolerance; and a bin size of 0.05 m/z. The remaining settings were kept at their default values. Peptide-spectrum matches (PSMs) were filtered at 1% FDR. [doi:10.25345/C5ZP3W37Q]
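The peak filtering and the two precursor tolerances quoted in the description can be illustrated with a small Python sketch. This is not ANN-SoLo code: the function names `filter_peaks` and `match_stage` are hypothetical, the 0.5 m/z window is assumed to mean plus or minus 0.5 m/z around the precursor, and the tolerances are assumed to be applied to neutral masses.

```python
from typing import List, Optional, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)

# Values quoted in the description.
MIN_MZ = 101.0               # peaks below this m/z are removed
MAX_MZ = 1500.0              # peaks above this m/z are removed
PRECURSOR_WINDOW_MZ = 0.5    # assumed to be +/- 0.5 m/z around the precursor
STANDARD_TOL_PPM = 20.0      # first step of the cascade search
OPEN_TOL_DA = 500.0          # open search


def filter_peaks(peaks: List[Peak], precursor_mz: float) -> List[Peak]:
    """Drop peaks outside [MIN_MZ, MAX_MZ] and peaks near the precursor."""
    return [
        (mz, intensity)
        for mz, intensity in peaks
        if MIN_MZ <= mz <= MAX_MZ
        and abs(mz - precursor_mz) > PRECURSOR_WINDOW_MZ
    ]


def match_stage(query_mass: float, library_mass: float) -> Optional[str]:
    """Return which search step would consider this library candidate.

    "standard" if the masses agree within 20 ppm, "open" if they differ
    by at most 500 Da, and None otherwise (hypothetical simplification).
    """
    delta = abs(query_mass - library_mass)
    if delta / library_mass * 1e6 <= STANDARD_TOL_PPM:
        return "standard"
    if delta <= OPEN_TOL_DA:
        return "open"
    return None
```

A candidate whose mass differs from the query by roughly 80 Da, for example, falls outside the 20 ppm window but inside the 500 Da window, so it would only be considered during the open search step. This is how modified peptides become reachable.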

[See results attachment job for details]

Keywords: SARS-CoV-2 ; coronavirus ; COVID-19 ; PPI ; open modification searching

Reanalyzed Datasets

  • MSV000085144 : A SARS-CoV-2-Human Protein-Protein Interaction Map Reveals Drug Targets and Potential Drug-Purposing
Number of Files: 201
Total Size: 14.09 GB
 
Identification Results
    Proteins (Reported): 108,214
    Peptides: 99,143
    Variant Peptides: 108,214
    PSMs: 830,743
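The description states that PSMs were filtered at 1% FDR, so the PSM count above is presumably the post-filter number. As a rough illustration of what such a filter does, here is a minimal Python sketch of plain target-decoy q-value filtering. It is an assumption for illustration only and not necessarily the procedure ANN-SoLo applies; the `PSM` type and `filter_at_fdr` function are hypothetical names.

```python
from typing import List, NamedTuple


class PSM(NamedTuple):
    spectrum_id: str
    score: float     # higher is better (assumption)
    is_decoy: bool   # True if the match is to a decoy library spectrum


def filter_at_fdr(psms: List[PSM], fdr: float = 0.01) -> List[PSM]:
    """Keep target PSMs whose q-value is at most `fdr`."""
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)

    # Running FDR estimate at each rank: decoys / targets seen so far.
    estimates = []
    targets = decoys = 0
    for psm in ranked:
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        estimates.append(decoys / max(targets, 1))

    # q-value: smallest FDR estimate at this rank or any lower-scoring rank.
    q_values = estimates[:]
    for i in range(len(q_values) - 2, -1, -1):
        q_values[i] = min(q_values[i], q_values[i + 1])

    return [p for p, q in zip(ranked, q_values) if q <= fdr and not p.is_decoy]
```

With a 1:1 ratio of decoy to target library spectra, as described for the compiled library, the number of accepted decoys serves as a direct estimate of the number of false target matches.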
 
Number of distinct conditions analyzed in this reanalysis. Distinct condition labels are counted across all files submitted in the "Metadata" category having a "Condition" column in this reanalysis.

Number of distinct biological replicates in this reanalysis. Distinct replicate labels are counted across all files submitted in the "Metadata" category having a "BioReplicate" or "Replicate" column in this reanalysis.

Number of distinct technical replicates in this reanalysis. The technical replicate count is defined as the maximum number of times any one distinct combination of condition and biological replicate was analyzed in files submitted in the "Metadata" category. In the case of fractionated experiments, only the first fraction is considered.

Originally identified proteins that were automatically remapped by MassIVE to proteins in the SwissProt human reference database.

Number of distinct protein accessions reported in this reanalysis.

Number of distinct unmodified peptide sequences reported in this reanalysis.

Number of distinct peptide sequences (including modified variants or peptidoforms) reported in this reanalysis.

Total number of peptide-spectrum matches (i.e. spectrum identifications) reported in this reanalysis.

Number of distinct proteins quantified in this reanalysis. Distinct protein accessions are counted across all files submitted in the "Statistical Analysis of Quantified Analytes" category having a "Protein" column in this reanalysis.

Number of distinct proteins found to be differentially abundant in at least one comparison in this reanalysis. A protein is differentially abundant if its change in abundance across conditions is found to be statistically significant with an adjusted p-value <= 0.05 and lists no issues associated with statistical tests for differential abundance. Distinct protein accessions are counted across all files submitted in the "Statistical Analysis of Quantified Analytes" category having a "Protein" column in this reanalysis.

For each of the counts defined above, "N/A" means no results of this type were submitted.