Comparison of different scoring methods based on latent variable models of the PHQ-9: an individual participant data meta-analysis

Fischer, F. et al. (2022) Comparison of different scoring methods based on latent variable models of the PHQ-9: an individual participant data meta-analysis. Psychological Medicine, 52(15), pp. 3472-3483. (doi: 10.1017/S0033291721000131) (PMID:33612144) (PMCID:PMC9393567)

Text
310039.pdf - Published Version (409kB)
Available under License Creative Commons Attribution Non-commercial Share Alike.

Abstract

Background: Previous research on the depression scale of the Patient Health Questionnaire (PHQ-9) has found that different latent factor models have maximized empirical measures of goodness-of-fit. The clinical relevance of these differences is unclear. We aimed to investigate whether depression screening accuracy may be improved by employing latent factor model-based scoring rather than sum scores.

Methods: We used an individual participant data meta-analysis (IPDMA) database compiled to assess the screening accuracy of the PHQ-9. We included studies that used the Structured Clinical Interview for DSM (SCID) as a reference standard and split those into calibration and validation datasets. In the calibration dataset, we estimated unidimensional, two-dimensional (separating cognitive/affective and somatic symptoms of depression), and bi-factor models, and the respective cut-offs to maximize combined sensitivity and specificity. In the validation dataset, we assessed the differences in (combined) sensitivity and specificity between the latent variable approaches and the optimal sum score (⩾10), using bootstrapping to estimate 95% confidence intervals for the differences.

Results: The calibration dataset included 24 studies (4378 participants, 652 major depression cases); the validation dataset included 17 studies (4252 participants, 568 cases). In the validation dataset, optimal cut-offs of the unidimensional, two-dimensional, and bi-factor models had higher sensitivity (by 0.036, 0.050, and 0.049 points, respectively) but lower specificity (by 0.017, 0.026, and 0.019 points, respectively) compared to the sum score cut-off of ⩾10.

Conclusions: In a comprehensive dataset of diagnostic studies, scoring using complex latent variable models does not meaningfully improve the screening accuracy of the PHQ-9 compared to the simple sum score approach.
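The two evaluation steps named in the Methods (choosing a cut-off that maximizes combined sensitivity and specificity, then bootstrapping the difference between two scoring rules) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the toy data, and the simple participant-level resampling are all assumptions made for illustration; in particular, the sketch ignores the clustering of participants within studies, which an analysis of pooled IPDMA data would likely need to handle.

```python
import random


def sens_spec(scores, labels, cutoff):
    """Sensitivity and specificity when score >= cutoff is screen-positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(scores, labels):
    """Observed score maximizing sensitivity + specificity (ties: lowest)."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda c: (sum(sens_spec(scores, labels, c)), -c))


def percentile_ci(values, level=0.95):
    """Approximate percentile interval from bootstrap replicates."""
    vals = sorted(values)
    tail = (1 - level) / 2
    lo = vals[int(tail * (len(vals) - 1))]
    hi = vals[int((1 - tail) * (len(vals) - 1))]
    return lo, hi


def bootstrap_diff_ci(scores_a, cut_a, scores_b, cut_b, labels,
                      n_boot=1000, seed=0):
    """95% CIs for (sens_a - sens_b) and (spec_a - spec_b).

    Assumption: participants are resampled with replacement as if they
    were independent; study-level clustering is not modelled here.
    """
    rng = random.Random(seed)
    n = len(labels)
    d_sens, d_spec = [], []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if sum(y) in (0, n):
            continue  # resample lacks cases or non-cases; skip it
        a = sens_spec([scores_a[i] for i in idx], y, cut_a)
        b = sens_spec([scores_b[i] for i in idx], y, cut_b)
        d_sens.append(a[0] - b[0])
        d_spec.append(a[1] - b[1])
    return percentile_ci(d_sens), percentile_ci(d_spec)


# Hypothetical toy data: eight participants, 1 = major depression case.
toy_scores = [2, 4, 6, 8, 10, 12, 14, 16]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
```

In this sketch, `scores_a` would hold model-based scores (for example, factor score estimates from a calibrated model) and `scores_b` the PHQ-9 sum scores for the same validation participants, with `cut_a` taken from `best_cutoff` on the calibration data and `cut_b` set to 10.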

Item Type: Articles
Additional Information: Authors: the Depression Screening Data (DEPRESSD) PHQ Collaboration.
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Quinn, Professor Terry and Taylor-Rowan, Dr Martin
Authors: Fischer, F., Levis, B., Falk, C., Sun, Y., Ioannidis, J. P. A., Cuijpers, P., Shrier, I., Benedetti, A., Thombs, B. D., et al.
College/School: College of Medical Veterinary and Life Sciences > School of Cardiovascular & Metabolic Health
Journal Name: Psychological Medicine
Publisher: Cambridge University Press
ISSN: 0033-2917
ISSN (Online): 1469-8978
Published Online: 22 February 2021
Copyright Holders: Copyright © The Author(s) 2021
First Published: First published in Psychological Medicine 52: 3472–3483
Publisher Policy: Reproduced under a creative commons licence
