Making the most of its short reads: a bioinformatics workflow for analysing the short-read-only data of Leishmania orientalis (formerly named Leishmania siamensis) isolate PCM2 in Thailand

Anuntasomboon, P., Siripattanapipong, S., Unajak, S., Choowongkomon, K., Burchmore, R., Leelayoova, S., Mungthin, M. and E-kobon, T. (2022) Making the most of its short reads: a bioinformatics workflow for analysing the short-read-only data of Leishmania orientalis (formerly named Leishmania siamensis) isolate PCM2 in Thailand. Biology, 11(9), 1272. (doi: 10.3390/biology11091272) (PMID:36138751) (PMCID:PMC9495971)

Abstract

Background: Leishmania orientalis (formerly named Leishmania siamensis) has been neglected for years in Thailand. Genomic study of L. orientalis has gained much attention recently, following the release of the first high-quality reference genome, that of isolate LSCM4. Integrating multiple sequencing platforms for whole-genome sequencing has proven effective, but at considerable cost. This study presents a preliminary bioinformatic workflow that couples multi-step de novo assembly with reference-based assembly to produce high-quality genomic drafts from the short-read Illumina sequence data of L. orientalis isolate PCM2.

Results: Integrating multi-step de novo assembly by MEGAHIT and SPAdes with the reference-based method using the L. enriettii genome, and salvaging the unmapped reads, resulted in a 30.27 Mb genomic draft of L. orientalis isolate PCM2 with 3367 contigs and 8887 predicted genes. The integrated approach showed the best integrity, coverage, and contig alignment when compared to the genome of L. orientalis isolate LSCM4 collected from the northern province of Thailand. Similar patterns of gene ratios and frequency were observed in the GO biological process annotation. Fifty GO terms were assigned to the assembled genomes, and 23 of these (accounting for 61.6% of the annotated genes) showed higher gene counts and ratios in the results from our workflow than in those of the LSCM4 isolate.

Conclusions: These results indicate that the proposed bioinformatic workflow produced an acceptable-quality genome of L. orientalis isolate PCM2 for functional genomic analysis, maximising the usage of the short-read data. This workflow would provide the extensive information required for identifying strain-specific markers and virulence-associated genes useful for drug and vaccine development before a more exhaustive and expensive investigation.

Item Type: Articles
Additional Information: This research was funded by Kasetsart University Research and Development Institute (KURDI), Kasetsart University, grant number FF(KU)6.64.
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Burchmore, Dr Richard
Creator Roles:
Burchmore, R.: Conceptualization, Supervision
Authors: Anuntasomboon, P., Siripattanapipong, S., Unajak, S., Choowongkomon, K., Burchmore, R., Leelayoova, S., Mungthin, M., and E-kobon, T.
College/School: College of Medical Veterinary and Life Sciences > School of Infection & Immunity
Journal Name: Biology
Publisher: MDPI
ISSN: 2079-7737
ISSN (Online): 2079-7737
Published Online: 26 August 2022
Copyright Holders: Copyright © 2022 The Authors
First Published: First published in Biology 11(9): 1272
Publisher Policy: Reproduced under a Creative Commons License
