Unsupervised discovery and comparison of structural families across multiple samples in untargeted metabolomics

van der Hooft, J. J.J., Wandy, J., Young, F., Padmanabhan, S., Gerasimidis, K., Burgess, K. E.V., Barrett, M. P. and Rogers, S. (2017) Unsupervised discovery and comparison of structural families across multiple samples in untargeted metabolomics. Analytical Chemistry, 89(14), pp. 7569-7577. (doi: 10.1021/acs.analchem.7b01391) (PMID:28621528) (PMCID:PMC5524435)

Text: 142536.pdf - Published Version (2MB)
Available under License Creative Commons Attribution.

Abstract

In untargeted metabolomics approaches, the inability to structurally annotate relevant features and map them to biochemical pathways is hampering the full exploitation of many metabolomics experiments. Furthermore, variable metabolic content across samples results in sparse feature matrices that are statistically hard to handle. Here, we introduce MS2LDA+, which tackles both of the above-mentioned problems. Previously, we presented MS2LDA, which extracts biochemically relevant molecular substructures (“Mass2Motifs”) from a collection of fragmentation spectra as sets of co-occurring molecular fragments and neutral losses, thereby recognizing building blocks of metabolomics. Here, we extend MS2LDA to handle multiple metabolomics experiments in one analysis, resulting in MS2LDA+. By linking Mass2Motifs across samples, we expose the variability in prevalence of structurally related metabolite families. We validate the differential prevalence of substructures between two distinct sample groups and apply it to fecal samples. Subsequently, within one sample group of urines, we rank the Mass2Motifs based on their variance to assess whether xenobiotic-derived substructures are among the most-variant Mass2Motifs. Indeed, we could ascribe 22 out of the 30 most-variant Mass2Motifs to xenobiotic-derived substructures including paracetamol/acetaminophen mercapturate and dimethylpyrogallol. In total, we structurally characterized 101 Mass2Motifs with biochemically or chemically relevant substructures. Finally, we combined the discovered metabolite families with full scan feature intensity information to obtain insight into core metabolites present in most samples and rare metabolites present in small subsets now linked through their common substructures. We conclude that, by biochemical grouping of metabolites across samples, MS2LDA+ aids in structural annotation of metabolites and guides prioritization of analysis by using Mass2Motif prevalence.
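
The variance-based ranking mentioned in the abstract can be illustrated with a minimal sketch. This is not the MS2LDA+ implementation: the function name, the motif labels, and the toy prevalence values are all hypothetical, and "prevalence" is assumed here to be a per-sample fraction of fragmentation spectra in which a motif occurs. Population variance is used as one plausible choice of spread measure.

```python
from statistics import pvariance


def rank_motifs_by_variance(prevalence):
    """Rank motifs from most to least variable across samples.

    prevalence: dict mapping a motif label to a list of per-sample
    prevalence values (assumed here to be fractions between 0 and 1).
    Returns a list of (motif, variance) pairs, highest variance first.
    """
    scored = [(motif, pvariance(values)) for motif, values in prevalence.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: three motifs measured in four samples.
prevalence = {
    "motif_A": [0.10, 0.11, 0.09, 0.10],  # present at a steady level everywhere
    "motif_B": [0.00, 0.40, 0.00, 0.35],  # present in only some samples
    "motif_C": [0.05, 0.05, 0.20, 0.05],  # elevated in a single sample
}

ranking = rank_motifs_by_variance(prevalence)
for motif, variance in ranking:
    print(f"{motif}: variance = {variance:.5f}")
```

In this toy setting, a motif that appears in only a subset of samples (as a xenobiotic-derived substructure might) ranks above one that is present at a steady level in every sample.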

Item Type: Articles
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Rogers, Dr Simon and Wandy, Dr Joe and Gerasimidis, Professor Konstantinos and Van Der Hooft, Mr Justin and Burgess, Dr Karl and Padmanabhan, Professor Sandosh and Barrett, Professor Michael
Authors: van der Hooft, J. J.J., Wandy, J., Young, F., Padmanabhan, S., Gerasimidis, K., Burgess, K. E.V., Barrett, M. P., and Rogers, S.
College/School: College of Medical Veterinary and Life Sciences > School of Infection & Immunity
College of Medical Veterinary and Life Sciences > School of Medicine, Dentistry & Nursing
College of Science and Engineering > School of Computing Science
College of Medical Veterinary and Life Sciences > School of Cardiovascular & Metabolic Health
Journal Name: Analytical Chemistry
Publisher: American Chemical Society
ISSN: 0003-2700
ISSN (Online): 1520-6882
Published Online: 16 June 2017
Copyright Holders: Copyright © 2017 American Chemical Society
First Published: First published in Analytical Chemistry 89(14): 7569-7577
Publisher Policy: Reproduced under a Creative Commons license


Project Code | Award No | Project Name | Principal Investigator | Funder's Name | Funder Ref | Lead Dept
680241 |  | Unifying metabolome and proteome informatics | Simon Rogers | Biotechnology and Biological Sciences Research Council (BBSRC) | BB/L018616/1 | COM - COMPUTING SCIENCE
623593 |  | Institutional Strategic Support Fund (ISSF) | Anna Dominiczak | Wellcome Trust (WELLCOTR) | 105614/Z/14/Z | RI CARDIOVASCULAR & MEDICAL SCIENCES
371799 |  | The Wellcome Centre for Molecular Parasitology (Core Support) | Andrew Waters | Wellcome Trust (WELLCOTR) | 104111/Z/14/Z & A | III - PARASITOLOGY
690421 |  | Glasgow Molecular Pathology (GMP) Node | Karin Oien | Medical Research Council (MRC) | MR/N005813/1 | ICS - EXPERIMENTAL THERAPEUTICS