Using transfer learning and dimensionality reduction techniques to improve generalisability of machine-learning predictions of mosquito ages from mid-infrared spectra

Mwanga, E. P., Siria, D. J., Mitton, J., Mshani, I. H., González-Jiménez, M., Selvarajah, P., Wynne, K., Baldini, F., Okumu, F. O. and Babayan, S. A. (2023) Using transfer learning and dimensionality reduction techniques to improve generalisability of machine-learning predictions of mosquito ages from mid-infrared spectra. BMC Bioinformatics, 24, 11. (doi: 10.1186/s12859-022-05128-5) (PMID:36624386) (PMCID:PMC9830685)

Text: 288743.pdf - Published Version (3MB)
Available under License Creative Commons Attribution.

Abstract

Background: Old mosquitoes are more likely to transmit malaria than young ones. Therefore, accurate prediction of mosquito population age can drastically improve the evaluation of mosquito-targeted interventions. However, standard methods for age-grading mosquitoes are laborious and costly. We have shown that mid-infrared spectroscopy (MIRS) can be used to detect age-specific patterns in mosquito cuticles and thus can be used to train age-grading machine learning models. However, these models tend to transfer poorly across populations. Here, we investigate whether applying dimensionality reduction and transfer learning to MIRS data can improve the transferability of MIRS-based predictions for mosquito ages.

Methods: We reared adults of the malaria vector Anopheles arabiensis in two insectaries. The heads and thoraces of female mosquitoes were scanned using an attenuated total reflection-Fourier transform infrared spectrometer, and the samples were grouped into two different age classes. The dimensionality of the spectral data was reduced using unsupervised principal component analysis or t-distributed stochastic neighbour embedding, and the reduced data were then used to train deep learning and standard machine learning classifiers. Transfer learning was also evaluated to improve transferability of the models when predicting mosquito age classes from new populations.

Results: Model accuracies for predicting the age of mosquitoes from the same population as the training samples reached 99% for deep learning and 92% for standard machine learning. However, these models did not generalise to a different population, achieving only 46% and 48% accuracy for deep learning and standard machine learning, respectively. Dimensionality reduction did not improve model generalisability but reduced computational time. Transfer learning by updating pre-trained models with 2% of mosquitoes from the alternate population improved performance to ~98% accuracy for predicting mosquito age classes in the alternate population.

Conclusion: Combining dimensionality reduction and transfer learning can reduce computational costs and improve the transferability of both deep learning and standard machine learning models for predicting the age of mosquitoes. Future studies should investigate the optimal quantities and diversity of training data necessary for transfer learning and the implications for broader generalisability to unseen datasets.
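
The workflow summarised in the abstract (reduce dimensionality, train on one population, then update the pre-trained model with 2% of the alternate population) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic rather than real MIRS spectra, PCA is computed with a plain NumPy SVD, and a small logistic regression stands in for the deep learning and standard machine learning classifiers used in the study. The function names (`make_population`, `fit_pca`, `train_logreg`) are hypothetical and do not come from the authors' code.

```python
import numpy as np

def make_population(rng, n, signal, offset, noise=0.2):
    """Synthetic 'spectra' (assumption): age signal plus a population offset."""
    y = (np.arange(n) % 2).astype(float)  # balanced labels: 0 = young, 1 = old
    X = np.outer(y + offset, signal) + noise * rng.normal(size=(n, signal.size))
    return X, y

def fit_pca(X, n_components):
    """PCA via SVD of the centred data; returns the mean and top components."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(X, mean, components):
    return (X - mean) @ components.T

def train_logreg(Z, y, w=None, b=0.0, lr=0.1, epochs=500):
    """Gradient-descent logistic regression; pass w and b to warm-start."""
    w = np.zeros(Z.shape[1]) if w is None else w.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(Z @ w + b)))
        w -= lr * (Z.T @ (p - y)) / len(y)
        b -= lr * float(np.mean(p - y))
    return w, b

def accuracy(Z, y, w, b):
    return float(np.mean(((Z @ w + b) > 0) == (y == 1)))

rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0.0, np.pi, 50))   # hypothetical age-related band
signal *= 2.0 / np.linalg.norm(signal)

# Source population (two draws: training and held-out) and a shifted target.
X_src, y_src = make_population(rng, 400, signal, offset=0.0)
X_val, y_val = make_population(rng, 200, signal, offset=0.0)
X_tgt, y_tgt = make_population(rng, 500, signal, offset=1.0)

# Dimensionality reduction fitted on the source population only.
mean, comps = fit_pca(X_src, n_components=5)
Z_src = project(X_src, mean, comps)
Z_tgt = project(X_tgt, mean, comps)

# Pre-train on the source population.
w, b = train_logreg(Z_src, y_src)
acc_source = accuracy(project(X_val, mean, comps), y_val, w, b)

# Take 2% of the target population (half per age class) for the update.
n_update = len(y_tgt) // 50
idx = np.concatenate(
    [np.flatnonzero(y_tgt == c)[: n_update // 2] for c in (0.0, 1.0)]
)
held_out = np.ones(len(y_tgt), dtype=bool)
held_out[idx] = False

acc_before = accuracy(Z_tgt[held_out], y_tgt[held_out], w, b)

# Transfer step: continue training from the pre-trained weights.
w_ft, b_ft = train_logreg(Z_tgt[idx], y_tgt[idx], w=w, b=b)
acc_after = accuracy(Z_tgt[held_out], y_tgt[held_out], w_ft, b_ft)

print(f"source held-out accuracy:      {acc_source:.2f}")
print(f"target accuracy before update: {acc_before:.2f}")
print(f"target accuracy after update:  {acc_after:.2f}")
```

The update step reuses the PCA projection fitted on the source population and only continues training the classifier, which is one simple way to realise "updating pre-trained models"; the accuracies printed for this toy data carry no empirical weight and should not be compared with the figures reported in the abstract.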

Item Type: Articles
Additional Information: This research was supported by the Medical Research Council (MRC) grant (Grant No. MR/P025501/1). EPM and DJS were also supported by the Wellcome Trust International Masters Fellowships in Tropical Medicine and Hygiene, Grant Nos. WT214643/Z/18/Z and WT 214644/Z/18/Z respectively.
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Okumu, Professor Fredros and Baldini, Dr Francesco and Mitton, Mr Joshua and Wynne, Professor Klaas and Mshani, Mr Issa and Babayan, Dr Simon and Siria, Doreen Josen and Gonzalez Jimenez, Dr Mario and Mwanga, Emmanuel
Authors: Mwanga, E. P., Siria, D. J., Mitton, J., Mshani, I. H., González-Jiménez, M., Selvarajah, P., Wynne, K., Baldini, F., Okumu, F. O., and Babayan, S. A.
College/School: College of Medical Veterinary and Life Sciences > School of Biodiversity, One Health & Veterinary Medicine
College of Science and Engineering > School of Chemistry
Journal Name: BMC Bioinformatics
Publisher: BioMed Central
ISSN: 1471-2105
ISSN (Online): 1471-2105
Copyright Holders: Copyright © 2023 The Authors
First Published: First published in BMC Bioinformatics 24: 11
Publisher Policy: Reproduced under a Creative Commons License
Data DOI: 10.5525/gla.researchdata.1235


Project Code: 174132
Award No:
Project Name: Development of a new tool for malaria mosquito surveillance to improve vector control
Principal Investigator: Heather Ferguson
Funder's Name: Medical Research Council (MRC)
Funder Ref: MR/P025501/1
Lead Dept: Institute of Biodiversity, Animal Health and Comparative Medicine