MassIVE Reanalysis - RMSV000000309.4

RPXD018240.4

Open modification searching of SARS-CoV-2-human protein interaction data with MSFragger

Description

We reanalyzed a SARS-CoV-2-human protein-protein interaction map from MSV000085144/PXD018117. With open modification searching (OMS), any type of PTM can be identified in an unbiased fashion, without the need to explicitly specify a limited number of variable modifications. This makes it possible to explore the general presence of PTMs in the context of the SARS-CoV-2 virus-host interactome.

The reanalysis with ANN-SoLo (RMSV000000359.1) resulted in a more than two-fold increase in identified spectra. Based on these identifications, our detected protein interactions showed a high overlap with independent mass spectrometry-based SARS-CoV-2 studies and with virus-host interactions reported for other viruses, and also included previously unknown protein interactions. Additionally, we identified several novel modification sites on SARS-CoV-2 proteins, which we investigated in relation to their interactions with host proteins. This reanalysis with MSFragger showed a reasonable overlap with the results from ANN-SoLo, illustrating that our results are not unique to ANN-SoLo and that a similar analysis could also be done with alternative OMS tools.

In the original study, affinity purification was performed using 27 SARS-CoV-2 proteins that were individually tagged and expressed in triplicate in HEK-293T cells. Bead-bound proteins were denatured, reduced, carbamidomethylated, and enzymatically digested using trypsin, and each sample was injected via an Easy-nLC 1200 (Thermo Fisher Scientific) into a Q-Exactive Plus mass spectrometer (Thermo Fisher Scientific). The SARS-CoV-2 proteins that were included are: all mature nonstructural proteins (Nsps), except for Nsp3 and Nsp16; a mutated version of Nsp5 to disable its proteolytic activity (Nsp5_C145A); and all predicted SARS-CoV-2 open reading frames (Orfs), including the spike (S), membrane (M), nucleocapsid (N), and envelope (E) proteins.

First, the downloaded raw files were converted to MGF files using ThermoRawFileParser (version 1.2.3). Next, OMS was performed using MSFragger (version 3.5) and FragPipe (version 18.0) against a concatenated FASTA file containing human protein sequences (UniProt reviewed sequences downloaded on 2020/02/28), the SARS-CoV-2 protein sequences (version 2020/03/05), and the green fluorescent protein sequence. An equal number of decoy protein sequences was generated using FragPipe. The MSFragger search settings included a precursor mass window from -150 Da to +500 Da, a fragment mass tolerance of 0.02 Da, and trypsin cleavage with up to two missed cleavages. Cysteine carbamidomethylation was used as a fixed modification, and oxidation of methionine and N-terminal acetylation were used as variable modifications. Other search settings were kept at their default values. PSMs were processed using PeptideProphet (version 4.4.0) with the FragPipe default settings for open searches and filtered at 1% FDR. [doi:10.25345/C53X83Q6B]
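To make the open-search precursor window concrete, the following Python sketch shows how a -150 Da to +500 Da window admits a peptide whose observed precursor mass differs from its theoretical mass by an unspecified modification. It is an illustration only and does not reproduce how MSFragger works internally. The function names, the annotation tolerance of 0.01 Da, and the example masses are assumptions made for this sketch; the window limits come from the description above, and the listed modification masses are standard monoisotopic values.

```python
from typing import Optional, Tuple

# Precursor mass window from the description above (in Da).
PRECURSOR_WINDOW_DA: Tuple[float, float] = (-150.0, 500.0)

# A few well-known monoisotopic mass shifts (Da), used here only to label
# an observed mass difference. This table is illustrative, not exhaustive.
KNOWN_SHIFTS_DA = {
    "Oxidation": 15.994915,
    "Acetyl": 42.010565,
    "Carbamidomethyl": 57.021464,
    "Phospho": 79.966331,
}


def mass_shift(observed_mass: float, peptide_mass: float) -> float:
    """Difference between observed precursor mass and theoretical peptide mass."""
    return observed_mass - peptide_mass


def is_candidate(
    observed_mass: float,
    peptide_mass: float,
    window: Tuple[float, float] = PRECURSOR_WINDOW_DA,
) -> bool:
    """True if the mass shift falls inside the open-search precursor window."""
    lower, upper = window
    return lower <= mass_shift(observed_mass, peptide_mass) <= upper


def annotate_shift(shift: float, tol: float = 0.01) -> Optional[str]:
    """Name of the closest known shift within tol Da (hypothetical tolerance), else None."""
    name, reference = min(KNOWN_SHIFTS_DA.items(), key=lambda kv: abs(kv[1] - shift))
    return name if abs(reference - shift) <= tol else None


if __name__ == "__main__":
    peptide_mass = 1000.0  # made-up theoretical mass for illustration
    observed = 1079.966331  # made-up observed precursor mass
    print(is_candidate(observed, peptide_mass))
    print(annotate_shift(mass_shift(observed, peptide_mass)))
```

In a closed search, the same peptide would only be matched if the corresponding modification had been listed as a variable modification beforehand; with the wide window, the mass difference is kept and can be interpreted afterwards.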

[See results attachment job for details]

Keywords: SARS-CoV-2 ; coronavirus ; COVID-19 ; PPI ; open modification searching

Reanalyzed Datasets

  • MSV000085144 : A SARS-CoV-2-Human Protein-Protein Interaction Map Reveals Drug Targets and Potential Drug-Repurposing
Number of Files: 3
Total Size: 172.44 MB
 
Identification Results
    Proteins (Reported): N/A
    Peptides: N/A
    Variant Peptides: N/A
    PSMs: N/A
 
"N/A" means no results of this type were submitted.