Mass spectrometry-based shotgun proteomics currently relies on matching mass spectra of protein fragments produced by protease digestion to amino acid sequences predicted from nucleic acid sequences. However, the method cannot reliably identify every single amino acid of proteins proteome-wide. We proposed a way to interpret shotgun proteomics results, specifically in data-dependent acquisition mode, as protein sequence coverage by multiple reads, as is done in nucleic acid sequencing for calling single nucleotide variants. Multiple reads for each letter in the proteome can be provided by overlapping distinct peptides, which confirm the presence of particular amino acid residues in the overlapping stretch with a much lower false discovery rate than the conventional 1%. These overlapping distinct peptides were, first, miscleaved tryptic peptides in combination with their properly cleaved counterparts and, second, peptides generated by several proteases with different specificities, each used to digest the same specimen, with the digests analyzed separately. We illustrated this approach using publicly available multiprotease proteomic datasets and in-house data for HEK-293 cell line subproteomes obtained using trypsin, LysC and GluC proteases. Overall proteome coverage in the example datasets, counting even a single read, was 20-30% at 5-8 thousand identified protein groups. Within this coverage, 5-7% of the whole proteome was covered at least two-fold and was thus identified with increased reliability. Of 36 single amino acid variants identified in the HEK-293 cell line, seven were covered at least two-fold. Sequence coverage by multiple reads may be increased further with greater proteome depth and a larger number of proteases used.
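The idea of counting "reads" per residue can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes peptides are located by exact substring matching, ignores modifications and isoleucine/leucine ambiguity, and uses a made-up protein sequence and peptide list. The function names `coverage_depth` and `fraction_covered` are hypothetical.

```python
from typing import Iterable, List


def coverage_depth(protein: str, peptides: Iterable[str]) -> List[int]:
    """Count, for each residue, how many distinct peptides cover it.

    Assumption: a peptide "covers" a residue if it matches the protein
    sequence exactly at a position spanning that residue.
    """
    depth = [0] * len(protein)
    for pep in set(peptides):  # repeated identifications count once
        if not pep:
            continue
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                depth[i] += 1
            start = protein.find(pep, start + 1)
    return depth


def fraction_covered(depth: List[int], min_reads: int = 1) -> float:
    """Fraction of residues covered by at least `min_reads` distinct peptides."""
    if not depth:
        return 0.0
    return sum(d >= min_reads for d in depth) / len(depth)


# Hypothetical toy protein and peptides (not taken from the dataset).
protein = "MKTAYIAKQRQISFVK"
peptides = [
    "TAYIAK",    # properly cleaved tryptic peptide
    "MKTAYIAK",  # miscleaved counterpart overlapping the same stretch
    "QISFVK",    # another peptide, covered only once
    "TAYIAK",    # duplicate identification, ignored as non-distinct
]

depth = coverage_depth(protein, peptides)
single = fraction_covered(depth, min_reads=1)
double = fraction_covered(depth, min_reads=2)
```

In this toy case the stretch `TAYIAK` is read twice, by the cleaved peptide and by its miscleaved counterpart, which mirrors the first source of overlapping peptides described above. Peptides from a second protease would simply be added to the same list.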
[doi:10.25345/C52S2T]
[dataset license: CC0 1.0 Universal (CC0 1.0)]
Keywords: shotgun proteomics, proteome coverage, multi protease analysis
Principal Investigators: (in alphabetical order)
Sergei Moshkovskii, Research and Clinical Center of Physical-Chemical Medicine, Russia
Submitting User: Ksenia
Lev I. Levitsky, Ksenia G. Kuznetsova, Anna A. Kliuchnikova, Irina Y. Ilina, Anton O. Goncharov, Anna A. Lobas, Mark V. Ivanov, Vassili N. Lazarev, Rustam H. Ziganshin, Mikhail V. Gorshkov, and Sergei A. Moshkovskii. Validating Amino Acid Variants in Proteogenomics Using Sequence Coverage by Multiple Reads. J Proteome Res. 2022 May 10. doi: 10.1021/acs.jproteome.2c00033. Online ahead of print.