Multi-Local Attention for Speech-Based Depression Detection

Tao, F., Ge, X., Ma, W., Esposito, A. and Vinciarelli, A. (2023) Multi-Local Attention for Speech-Based Depression Detection. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023), Rhodes, Greece, 4-10 June 2023, ISBN 9781728163277 (doi: 10.1109/ICASSP49357.2023.10095757)

Abstract

This article shows that an attention mechanism, Multi-Local Attention, can improve a depression detection approach based on Long Short-Term Memory networks. Besides leading to higher performance metrics (e.g., Accuracy and F1 Score), Multi-Local Attention improves two other aspects of the approach, both important from an application point of view. The first is the effectiveness of a confidence score, associated with the detection outcome, at identifying the speakers most likely to be classified correctly. The second is the amount of speaking time needed to classify a speaker as depressed or non-depressed. The experiments were performed on read speech and involved 109 participants (including 55 diagnosed with depression by professional psychiatrists). The results show accuracies up to 88.0% (F1 Score 88.0%).
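The abstract does not spell out the mechanism, so the sketch below is only a rough illustration of one plausible form of multi-scale local attention pooling over LSTM hidden states: attention weights are computed independently within fixed-size windows, and several window sizes are combined before classification. The window sizes, feature dimensions, pooling, and classifier head are assumptions for illustration, not the authors' exact formulation.

```python
# A minimal sketch (PyTorch), assuming attention applied within local
# windows of the LSTM output at several scales; not the paper's exact model.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalAttentionPool(nn.Module):
    """Attention pooling computed independently within fixed-size windows."""
    def __init__(self, hidden_dim: int, window: int):
        super().__init__()
        self.window = window
        self.score = nn.Linear(hidden_dim, 1)        # per-frame attention score

    def forward(self, h):                            # h: (batch, time, hidden)
        b, t, d = h.shape
        pad = (-t) % self.window                     # pad so time divides evenly
        h = F.pad(h, (0, 0, 0, pad))
        h = h.view(b, -1, self.window, d)            # (batch, n_win, window, hidden)
        w = torch.softmax(self.score(h), dim=2)      # attention within each window
        return (w * h).sum(dim=2)                    # (batch, n_win, hidden)

class MultiLocalAttentionClassifier(nn.Module):
    """LSTM encoder followed by local attention at several window sizes."""
    def __init__(self, n_features=40, hidden=128, windows=(5, 10, 20)):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.pools = nn.ModuleList(LocalAttentionPool(hidden, w) for w in windows)
        self.head = nn.Linear(hidden * len(windows), 2)  # depressed vs. control

    def forward(self, x):                            # x: (batch, time, n_features)
        h, _ = self.lstm(x)
        # Average the windowed summaries per scale, then concatenate scales.
        z = torch.cat([p(h).mean(dim=1) for p in self.pools], dim=-1)
        return self.head(z)                          # class logits

# Usage on dummy input: 4 utterances, 300 frames, 40 acoustic features each.
logits = MultiLocalAttentionClassifier()(torch.randn(4, 300, 40))
# The softmax of the logits yields per-class probabilities whose maximum
# could serve as a confidence score of the kind the abstract mentions.
confidence = torch.softmax(logits, dim=-1).max(dim=-1).values
print(logits.shape, confidence.shape)  # torch.Size([4, 2]) torch.Size([4])
```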

Item Type: Conference Proceedings
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Tao, Mr Fuxiang and Ge, Ms Xuri and Ma, Wei and Vinciarelli, Professor Alessandro
Authors: Tao, F., Ge, X., Ma, W., Esposito, A., and Vinciarelli, A.
College/School: College of Science and Engineering > School of Computing Science
ISBN: 9781728163277
Published Online: 05 May 2023
Copyright Holders: Copyright © 2023 IEEE
First Published: First published in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023)
Publisher Policy: Reproduced in accordance with the publisher copyright policy
