Acute stroke CDS: automatic retrieval of thrombolysis contraindications from unstructured clinical letters

Cutforth, M. et al. (2023) Acute stroke CDS: automatic retrieval of thrombolysis contraindications from unstructured clinical letters. Frontiers in Digital Health, 5, 1186516. (doi: 10.3389/fdgth.2023.1186516) (PMID:37388253) (PMCID:PMC10305776)

Text: 301935.pdf - Published Version (3MB)
Available under License Creative Commons Attribution.

Abstract

Introduction: Thrombolysis treatment for acute ischaemic stroke can lead to better outcomes if administered early enough. However, contraindications exist which put the patient at greater risk of a bleed (e.g. recent major surgery, anticoagulant medication). Therefore, clinicians must check a patient's past medical history before proceeding with treatment. In this work we present a machine learning approach for accurate automatic detection of this information in unstructured text documents such as discharge letters or referral letters, to support the clinician in making a decision about whether to administer thrombolysis.
Methods: We consulted local and national guidelines for thrombolysis eligibility, identifying 86 entities which are relevant to the thrombolysis decision. A total of 8,067 documents from 2,912 patients were manually annotated with these entities by medical students and clinicians. Using this data, we trained and validated several transformer-based named entity recognition (NER) models, focusing on transformer models which have been pre-trained on a biomedical corpus as these have shown most promise in the biomedical NER literature.
Results: Our best model was a PubMedBERT-based approach, which obtained a lenient micro/macro F1 score of 0.829/0.723. Ensembling 5 variants of this model gave a significant boost to precision, obtaining micro/macro F1 of 0.846/0.734 which approaches the human annotator performance of 0.847/0.839. We further propose numeric definitions for the concepts of name regularity (similarity of all spans which refer to an entity) and context regularity (similarity of all context surrounding mentions of an entity), using these to analyse the types of errors made by the system and finding that the name regularity of an entity is a stronger predictor of model performance than raw training set frequency.
Discussion: Overall, this work shows the potential of machine learning to provide clinical decision support (CDS) for the time-critical decision of thrombolysis administration in ischaemic stroke by quickly surfacing relevant information, leading to prompt treatment and hence to better patient outcomes.
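
The abstract reports both micro and macro F1 under lenient matching. As a reading aid, the following is a minimal sketch of how the two averages differ. It assumes that "lenient" means a predicted span counts as correct when it overlaps a gold span with the same entity label; the abstract does not spell out the exact criterion, so this is an assumption, and the entity labels and character offsets in the usage lines are hypothetical.

```python
from collections import defaultdict


def spans_overlap(a, b):
    """Return True if two (start, end) character spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def lenient_counts(gold, pred):
    """Count per-label TP/FP/FN for (label, start, end) spans.

    Assumption: a prediction is a true positive if it overlaps a not yet
    matched gold span carrying the same label.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    matched = set()
    for label, start, end in pred:
        hit = next(
            (i for i, (g_label, g_start, g_end) in enumerate(gold)
             if i not in matched
             and g_label == label
             and spans_overlap((start, end), (g_start, g_end))),
            None,
        )
        if hit is None:
            counts[label]["fp"] += 1
        else:
            matched.add(hit)
            counts[label]["tp"] += 1
    # Gold spans that no prediction matched are false negatives.
    for i, (g_label, _, _) in enumerate(gold):
        if i not in matched:
            counts[g_label]["fn"] += 1
    return counts


def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(counts):
    """Micro F1 pools counts over all labels; macro F1 averages per-label F1."""
    if not counts:
        return 0.0, 0.0
    tp = sum(c["tp"] for c in counts.values())
    fp = sum(c["fp"] for c in counts.values())
    fn = sum(c["fn"] for c in counts.values())
    micro = f1(tp, fp, fn)
    macro = sum(f1(**c) for c in counts.values()) / len(counts)
    return micro, macro


# Hypothetical annotations: (entity label, start offset, end offset).
gold = [("anticoagulant", 0, 8), ("anticoagulant", 20, 30), ("recent_surgery", 40, 55)]
pred = [("anticoagulant", 0, 5), ("anticoagulant", 21, 30), ("recent_surgery", 60, 70)]
micro, macro = micro_macro_f1(lenient_counts(gold, pred))
print(f"micro F1 = {micro:.3f}, macro F1 = {macro:.3f}")
```

Because micro F1 pools counts across entities, frequent entities dominate it, whereas macro F1 weights each of the entities equally. This is one plausible reading of why the reported macro scores sit below the micro scores: rarer or less regular entities pull the macro average down.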

Item Type: Articles
Keywords: Named entity recognition (NER), thrombolysis, acute stroke, clinical decision support (CDS), machine learning (ML).
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Brown, Dr Cameron and Muir, Professor Keith
Authors: Cutforth, M., Watson, H., Brown, C., Wang, C., Thomson, S., Fell, D., Dilys, V., Scrimgeour, M., Schrempf, P., Lesh, J., Muir, K., Weir, A., and O’Neil, A. Q.
College/School: College of Medical Veterinary and Life Sciences > School of Psychology & Neuroscience
Journal Name: Frontiers in Digital Health
Publisher: Frontiers Media
ISSN: 2673-253X
ISSN (Online): 2673-253X
Copyright Holders: Copyright © 2023 Cutforth, Watson, Brown, Wang, Thomson, Fell, Dilys, Scrimgeour, Schrempf, Lesh, Muir, Weir and O'Neil
First Published: First published in Frontiers in Digital Health 5: 1186516
Publisher Policy: Reproduced under a Creative Commons License


Project Code: 304546
Award No:
Project Name: I-CAIRD: Industrial Centre for AI Research in Digital Diagnostics
Principal Investigator: Keith Muir
Funder's Name: Innovate UK (INNOVATE)
Funder Ref: 104690
Lead Dept: SPN - Centre for Stroke & Brain Imaging