Ecological observations based on functional gene sequencing are sensitive to the amplicon processing method

Cholet, F., Lisik, A., Agogué, H., Ijaz, U. Z., Pineau, P., Lachaussée, N. and Smith, C. J. (2022) Ecological observations based on functional gene sequencing are sensitive to the amplicon processing method. mSphere, 7(4), e0032422. (doi: 10.1128/msphere.00324-22) (PMID:35938727) (PMCID:PMC9429940)

275208.pdf - Published Version
Available under License Creative Commons Attribution.



Until recently, the de facto method for short-read-based amplicon reconstruction was a sequence similarity threshold approach (operational taxonomic units [OTUs]). This has changed with the amplicon sequence variant (ASV) method where distributions are fitted to abundance profiles of individual genes using a noise-error model. While OTU-based approaches are still useful for 16S rRNA/18S rRNA genes, where thresholds of 97% to 99% are used, their use for functional genes is still debatable as there is no consensus on clustering thresholds. Here, we compare OTU- and ASV-based reconstruction approaches and taxonomy assignment methods, the naive Bayesian classifier (NBC) and Bayesian lowest common ancestor (BLCA) algorithm, using a functional gene data set from the microbial nitrogen-cycling community in the Brouage mudflat (France). A range of OTU similarity thresholds and ASVs were used to compare amoA (ammonia-oxidizing archaea [AOA] and ammonia-oxidizing bacteria [AOB]), nxrB, nirS, nirK, and nrfA communities between differing sedimentary structures. Significant effects of the sedimentary structure on weighted UniFrac (WUniFrac) distances were observed for AOA amoA when using ASVs, an OTU at a threshold of 97% sequence identity (OTU-97%), and OTU-85%; AOB amoA when using OTU-85%; and nirS when using ASV, OTU-90%, and OTU-85%. For AOB amoA, significant effects of the sedimentary structures on UniFrac distances were observed when using OTU-97% but not ASVs, and the inverse was found for nrfA. Interestingly, conclusions drawn for nirK and nxrB were consistent between amplicon reconstruction methods. We also show that when the sequences in the reference database are related to the environment in question, the BLCA algorithm leads to more phylogenetically relevant classifications. However, when the reference database contains sequences more dissimilar to the ones retrieved, the NBC obtains more information.
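As an illustration of why the choice of OTU similarity threshold can change what is observed, the following is a minimal sketch, not the pipeline used in the study. It assumes toy, equal-length sequences, a simple position-by-position identity measure, and a greedy centroid clustering; the names `identity`, `greedy_otu_cluster`, and `reads` are hypothetical. It does not model ASV inference, which relies on an error model rather than a similarity threshold.

```python
from typing import Dict, List


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("toy identity assumes equal-length, pre-aligned sequences")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def greedy_otu_cluster(seqs: List[str], threshold: float) -> List[List[str]]:
    """Greedy centroid clustering: each sequence joins the first centroid it
    matches at or above the threshold, otherwise it founds a new cluster."""
    centroids: List[str] = []
    clusters: List[List[str]] = []
    for seq in seqs:
        for i, centroid in enumerate(centroids):
            if identity(seq, centroid) >= threshold:
                clusters[i].append(seq)
                break
        else:
            # No existing centroid was similar enough: start a new OTU.
            centroids.append(seq)
            clusters.append([seq])
    return clusters


# Hypothetical toy reads (25 nt each), not data from the study.
reads = [
    "ACGTACGTACGTACGTACGTACGTA",  # reference-like read
    "TCGTACGTACGTACGTACGTACGTA",  # 1 mismatch to the first read (96% identity)
    "ACGTGCGTGCGTGCGTACGTACGTA",  # 3 mismatches to the first read (88% identity)
    "T" * 25,                     # unrelated read
]

# Number of OTUs recovered at each threshold used for illustration.
otu_counts: Dict[float, int] = {
    t: len(greedy_otu_cluster(reads, t)) for t in (0.97, 0.90, 0.85)
}

for t, n in otu_counts.items():
    print(f"OTU-{int(t * 100)}%: {n} clusters")
```

In this toy case the same four reads collapse into fewer clusters as the threshold is lowered, so any downstream distance or diversity measure is computed on a different set of units at each threshold.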

Item Type: Articles
Additional Information: We acknowledge the following funders: Thomas Crawford Hayes NUIG (awarded to C.J.S. and A.L.), a Harry Smith vacation studentship (awarded to A.L.), a mobility grant from the French Embassy in Ireland (awarded to H.A. and C.J.S.), a research visit grant, a Microbiology Society and Research Student Mobility grant, the University of Glasgow (awarded to F.C.), La Rochelle University, and the Environmental Protection Agency STRIVE Doctoral Scholarship Scheme (2012-W-PhD-6) (awarded to C.J.S.). F.C. is supported by a University of Glasgow College of Science and Engineering doctoral scholarship, U.Z.I. was funded by NERC IRF NE/L011956/1, and C.J.S. is supported by a Royal Academy of Engineering-Scottish Water Research Chair (RCSRF1718643) and EPSRC award EP/V030515/1.
Glasgow Author(s) Enlighten ID: Cholet, Dr Fabien and Smith, Professor Cindy and Ijaz, Dr Umer
Authors: Cholet, F., Lisik, A., Agogué, H., Ijaz, U. Z., Pineau, P., Lachaussée, N., and Smith, C. J.
College/School: College of Science and Engineering > School of Engineering > Infrastructure and Environment
Journal Name: mSphere
Publisher: American Society for Microbiology
ISSN (Online): 2379-5042
Published Online: 08 August 2022
Copyright Holders: Copyright © 2022 Cholet et al.
First Published: First published in mSphere 7(4): e0032422
Publisher Policy: Reproduced under a Creative Commons licence


Project Code | Award No | Project Name | Principal Investigator | Funder's Name | Funder Ref | Lead Dept
170256 | | Understanding microbial community through in situ environmental 'omic data synthesis | Umer Zeeshan Ijaz | Natural Environment Research Council (NERC) | NE/L011956/1 | ENG - Infrastructure & Environment
309846 | | Decentralised water technologies | William Sloan | Engineering and Physical Sciences Research Council (EPSRC) | EP/V030515/1 | ENG - Infrastructure & Environment