Development of an algorithm to classify primary care electronic health records of alcohol consumption: experience using data linkage from UK Biobank and primary care electronic health data sources

Fraile Navarro, D., Azcoaga Lorenzo, A., Agrawal, U., Jani, B. D., Fagbamigbe, A., Currie, D., Baldacchino, A. M. and Sullivan, F. M. (2022) Development of an algorithm to classify primary care electronic health records of alcohol consumption: experience using data linkage from UK Biobank and primary care electronic health data sources. BMJ Open, 12(2), e054376. (doi: 10.1136/bmjopen-2021-054376) (PMID:35105585) (PMCID:PMC8808438)

263660.pdf - Published Version
Available under License Creative Commons Attribution Non-commercial.


Abstract

Objectives: Develop a novel algorithm to categorise alcohol consumption using primary care electronic health records (EHRs) and assess its reliability by comparing this classification with self-reported alcohol consumption data obtained from the UK Biobank (UKB) cohort.

Design: Cross-sectional study.

Setting: The UKB, a population-based cohort with participants aged between 40 and 69 years recruited across the UK between 2006 and 2010.

Participants: UKB participants from Scotland with linked primary care data.

Primary and secondary outcome measures: Create a rule-based multiclass algorithm to classify the alcohol consumption reported by Scottish UKB participants and compare it with their classification based on Read Codes present in primary care EHRs. We evaluated agreement metrics (simple agreement and the kappa statistic).

Results: Among the Scottish UKB participants, 18 838 (69%) had at least one Read Code related to alcohol consumption and were used in the classification. Agreement between the alcohol consumption categories in UKB and in primary care data, when assessments within 5 years were included, was 59.6%, with a kappa of 0.23 (95% CI 0.21 to 0.24). Differences in classification between the two sources were statistically significant (p<0.001): more individuals were classified as ‘sensible drinkers’ and at lower alcohol consumption levels in primary care records than in the UKB. Agreement improved slightly when using only numerical values (kappa=0.29; 95% CI 0.27 to 0.31) and decreased when using only qualitative descriptors (kappa=0.18; 95% CI 0.16 to 0.20).

Conclusion: Our algorithm classifies alcohol consumption recorded in primary care EHRs into discrete, meaningful categories. These results suggest that alcohol consumption may be underestimated in primary care EHRs. Using numerical values (alcohol units) may improve classification compared with qualitative descriptors.
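The two agreement metrics named in the abstract, simple agreement and the kappa statistic, can be illustrated with a minimal sketch. It assumes the unweighted Cohen's kappa for two sources that each assign one category per participant; the abstract does not state which kappa variant or software the authors used. The function names, the category labels other than ‘sensible’, and the eight toy records are hypothetical and are not data from the study.

```python
from collections import Counter


def simple_agreement(source_a, source_b):
    """Proportion of participants given the same category by both sources."""
    if len(source_a) != len(source_b) or not source_a:
        raise ValueError("both sources need the same non-zero number of labels")
    matches = sum(a == b for a, b in zip(source_a, source_b))
    return matches / len(source_a)


def cohen_kappa(source_a, source_b):
    """Unweighted Cohen's kappa: observed agreement corrected for chance."""
    n = len(source_a)
    p_observed = simple_agreement(source_a, source_b)
    counts_a = Counter(source_a)
    counts_b = Counter(source_b)
    # Chance agreement: product of the two marginal proportions per category.
    categories = set(counts_a) | set(counts_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when both sources use one category")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy labels (NOT study data): self-report vs. EHR-derived category.
ukb = ["sensible", "sensible", "hazardous", "hazardous",
       "non-drinker", "sensible", "hazardous", "non-drinker"]
ehr = ["sensible", "sensible", "sensible", "hazardous",
       "non-drinker", "sensible", "sensible", "sensible"]

print(round(simple_agreement(ukb, ehr), 3))  # 0.625
print(round(cohen_kappa(ukb, ehr), 3))       # 0.415
```

In this toy example the EHR-derived labels lean towards ‘sensible’, so raw agreement (0.625) is noticeably higher than kappa (0.415) once chance agreement is removed. This is the same kind of gap the abstract reports between 59.6% agreement and a kappa of 0.23. The sketch does not compute confidence intervals.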

Item Type: Articles
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Jani, Dr Bhautesh
Authors: Fraile Navarro, D., Azcoaga Lorenzo, A., Agrawal, U., Jani, B. D., Fagbamigbe, A., Currie, D., Baldacchino, A. M., and Sullivan, F. M.
College/School: College of Medical Veterinary and Life Sciences > School of Health & Wellbeing > General Practice and Primary Care
Journal Name: BMJ Open
Publisher: BMJ Publishing Group
ISSN: 2044-6055
ISSN (Online): 2044-6055
Published Online: 01 February 2022
Copyright Holders: Copyright © Author(s) (or their employer(s)) 2022
First Published: First published in BMJ Open 12(2): e054376
Publisher Policy: Reproduced under a Creative Commons licence
