MassIVE MSV000092669

Partial Public

LC-HRMS/MS Polysorbate Dataset from PolyMatch: Novel Libraries, Algorithms and Visualizations for Discovering Polymers and Chemical Series

Description

This is a dataset for Polysorbate 80 with iterative exclusion MS/MS applied on an LC-HRMS/MS Q-TOF platform (Agilent 6546). The corresponding interactive dataset can be found here: https://innovativeomics.com/datasets/polysorbate-beta. Polymers and chemical series are integral components of everyday products, ranging from plastics and emulsifiers to lubricants and detergents. Characterizing these materials at the molecular level, including the number of repeating units, chemical moieties, fatty acids, and degree of unsaturation, is essential to understanding their physicochemical properties and potential health impacts. This study introduces PolyMatch, free open-source software designed to annotate polysorbates, polysorbides, polyethylene glycols (PEGs), fatty acid esterified species, and related chemical species based on the mass spectral and chromatographic patterns inherent in the repeating nature of chemical moieties. PolyMatch facilitates the generation of MS/MS libraries for characterizing polymeric chemical species (over 800,000 structures with associated fragment masses are already built in) and covers the entire liquid chromatography high-resolution tandem mass spectrometry (LC-HRMS/MS) data-processing workflow: peak picking, blank filtering, annotation, data visualization, and sharing of interactive datasets with the community via an HTML link. The software was applied to a Tween 80 mixture analyzed by LC-HRMS/MS on an Agilent 6546 Q-TOF instrument, with iterative exclusion for comprehensive fragmentation coverage. PolyMatch automatically assigned 86 features with high confidence at the species level, 362 based on PEG-containing fragments and accurate mass matching to a simulated polymer database, and over 10,000 based on membership in a homologous series (3 or more members) with CH2CH2O repeating units. The ease of use of PolyMatch and its comprehensive coverage with species-level assignment will contribute to the advancement of materials science, health research, and product development.
[doi:10.25345/C50V89T33] [dataset license: CC0 1.0 Universal (CC0 1.0)]
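The homologous-series criterion mentioned in the description (3 or more members separated by CH2CH2O units) can be illustrated with a minimal sketch. This is not PolyMatch's implementation: the function name `find_eo_series`, the 10 ppm tolerance, and the assumption of singly charged ions are all hypothetical choices made for illustration, and the sketch ignores the chromatographic (retention time) patterns that the description says PolyMatch also uses.

```python
from typing import List

# Monoisotopic masses (Da) of the elements in one C2H4O repeat unit.
C_MASS = 12.0
H_MASS = 1.00782503
O_MASS = 15.99491462
EO_MASS = 2 * C_MASS + 4 * H_MASS + O_MASS  # about 44.0262 Da


def find_eo_series(
    mzs: List[float], tol_ppm: float = 10.0, min_members: int = 3
) -> List[List[float]]:
    """Group m/z values into chains whose neighbours differ by one C2H4O unit.

    Assumes singly charged ions (hypothetical simplification), so the
    spacing between consecutive members equals EO_MASS.
    """
    ordered = sorted(mzs)
    used = set()  # indices already assigned to a reported series
    series = []
    for i, start in enumerate(ordered):
        if i in used:
            continue
        chain = [i]
        current = start
        while True:
            target = current + EO_MASS
            tol = target * tol_ppm * 1e-6  # ppm tolerance converted to Da
            # Look for the next unused peak that sits one repeat unit higher.
            nxt = next(
                (
                    j
                    for j in range(chain[-1] + 1, len(ordered))
                    if j not in used and abs(ordered[j] - target) <= tol
                ),
                None,
            )
            if nxt is None:
                break
            chain.append(nxt)
            current = ordered[nxt]
        # Only chains with enough members count as a homologous series.
        if len(chain) >= min_members:
            used.update(chain)
            series.append([ordered[j] for j in chain])
    return series
```

Calling `find_eo_series` on a list of feature m/z values returns only the chains that reach the minimum length, so a pair of peaks spaced by one repeat unit is not reported as a series. Multiply charged ions would show a spacing of the repeat-unit mass divided by the charge, which this sketch does not handle.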

Keywords: polysorbate ; polysorbate 80 ; LC-HRMS/MS ; PolyMatch ; FluoroMatch ; Polymer ; Plastic ; Pharmaceutical formulations ; vaccine ; PEG ; polyethylene glycol

Contact

Principal Investigators:
(in alphabetical order)
David Weil, Agilent Technologies, United States
Jeremy Koelmel, Yale University, United States
Krystal G. Pollitt, Yale University, United States
Submitting User: jeremykoelmel