Salamin, H., Vinciarelli, A., Truong, K. and Mohammadi, G. (2010) Automatic role recognition based on conversational and prosodic behaviour. In: International conference on Multimedia (MM '10), Firenze, Italy, 25-29 Oct 2010, pp. 847-850. (doi: 10.1145/1873951.1874094)
Full text not currently available from Enlighten.
Publisher's URL: http://dx.doi.org/10.1145/1873951.1874094
Abstract
This paper proposes an approach for the automatic recognition of roles in settings like news and talk-shows, where roles correspond to specific functions like Anchorman, Guest or Interview Participant. The approach is based on purely nonverbal vocal behavioral cues, including who talks when and how much (turn-taking behavior), and statistical properties of pitch, formants, energy and speaking rate (prosodic behavior). The experiments have been performed over a corpus of around 50 hours of broadcast material, and the accuracy (the percentage of time correctly labeled in terms of role) is up to 89%. Both turn-taking and prosodic behavior lead to satisfactory results. Furthermore, on one database, their combination leads to a statistically significant improvement.
Item Type: | Conference Proceedings |
---|---|
Status: | Published |
Refereed: | Yes |
Glasgow Author(s) Enlighten ID: | Vinciarelli, Professor Alessandro and Salamin, Mr Hugues |
Authors: | Salamin, H., Vinciarelli, A., Truong, K., and Mohammadi, G. |
College/School: | College of Science and Engineering > School of Computing Science |