Shrinking a large dataset to identify variables associated with increased risk of Plasmodium falciparum infection in Western Kenya

Tremblay, M., Dahm, J.S., Wamae, C.N., De Glanville, W.A., Fevre, E.M. and Dopfer, D. (2015) Shrinking a large dataset to identify variables associated with increased risk of Plasmodium falciparum infection in Western Kenya. Epidemiology and Infection, 143(16), pp. 3538-3545. (doi: 10.1017/S0950268815000710) (PMID:25876816)

106238.pdf - Published Version
Available under License Creative Commons Attribution.

193kB

Abstract

Large datasets are often not amenable to analysis using traditional single-step approaches. Our general objective was to apply imputation techniques, principal component analysis (PCA), elastic net, and generalized linear models to a large dataset in a systematic approach to extract the most meaningful predictors of a health outcome. To demonstrate these techniques, we extracted predictors of Plasmodium falciparum infection from a large covariate dataset with a limited number of observations, using data from the People, Animals, and their Zoonoses (PAZ) project. The data, collected from 415 homesteads in western Kenya, contained over 1500 variables describing the health, environment, and social factors of the humans, livestock, and the homesteads in which they reside. The wide, sparse dataset was reduced to 42 predictors of P. falciparum malaria infection, and wealth rankings were produced for all homesteads. The 42 predictors make biological sense and are supported by previous studies. This systematic data-mining approach could make many large datasets more manageable and informative for decision-making processes and health policy prioritization.
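
As a rough illustration of the shrinking idea described in the abstract, the following is a minimal sketch of two of the named steps, imputation followed by elastic net selection. It is not the authors' code: the data are synthetic, the outcome is treated as continuous with a squared-error loss rather than modelled with a generalized linear model, the PCA step is omitted, and every function name and parameter value (`impute_mean`, `standardize`, `elastic_net`, `lam`, `alpha`) is an assumption made for this example.

```python
import random


def impute_mean(rows):
    """Replace missing entries (None) with the mean of the observed values in that column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]


def standardize(rows):
    """Scale every column to mean 0 and (population) variance 1."""
    n, p = len(rows), len(rows[0])
    out = [[0.0] * p for _ in range(n)]
    for j in range(p):
        col = [r[j] for r in rows]
        mu = sum(col) / n
        sd = (sum((v - mu) ** 2 for v in col) / n) ** 0.5 or 1.0
        for i in range(n):
            out[i][j] = (rows[i][j] - mu) / sd
    return out


def soft_threshold(z, g):
    """Shrink z toward zero by g; values within [-g, g] become exactly zero."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0


def elastic_net(X, y, lam, alpha, n_iter=100):
    """Coordinate descent for
    (1/2n)*||y - X b||^2 + lam*(alpha*|b|_1 + (1 - alpha)/2*||b||_2^2),
    assuming standardized columns of X and a centered y."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, kept up to date as b changes
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of column j with the residual that excludes column j
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + beta[j]
            new = soft_threshold(rho, lam * alpha) / (1.0 + lam * (1.0 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
    return beta


# Synthetic stand-in for a wide dataset (hypothetical, not the PAZ data):
# 20 candidate variables, of which only columns 0 and 1 drive the outcome.
random.seed(0)
n, p = 200, 20
raw = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [2.0 * r[0] - 1.5 * r[1] + random.gauss(0, 0.5) for r in raw]

# Knock out roughly 10% of the entries to mimic sparsity, then impute and scale.
sparse = [[None if random.random() < 0.1 else v for v in r] for r in raw]
X = standardize(impute_mean(sparse))
y_mean = sum(y) / n
y_centered = [v - y_mean for v in y]

beta = elastic_net(X, y_centered, lam=0.3, alpha=0.9)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("columns kept by the elastic net:", selected)
```

The L1 part of the penalty sets the coefficients of weakly related variables to exactly zero, which is what reduces a wide table to a short list of candidate predictors; the L2 part stabilizes the fit when variables are correlated. In a workflow like the one the abstract outlines, the retained variables would then be passed to a generalized linear model suited to the infection outcome.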

Item Type: Articles
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: De Glanville, Dr William
Authors: Tremblay, M., Dahm, J.S., Wamae, C.N., De Glanville, W.A., Fevre, E.M., and Dopfer, D.
College/School: College of Medical Veterinary and Life Sciences > Institute of Biodiversity Animal Health and Comparative Medicine
Journal Name: Epidemiology and Infection
Publisher: Cambridge University Press
ISSN: 0950-2688
ISSN (Online): 1469-4409
Copyright Holders: Copyright © 2015 Cambridge University Press
First Published: First published in Epidemiology and Infection 2015
Publisher Policy: Reproduced under a Creative Commons License
