Automatic role recognition based on conversational and prosodic behaviour

Salamin, H., Vinciarelli, A., Truong, K. and Mohammadi, G. (2010) Automatic role recognition based on conversational and prosodic behaviour. In: International Conference on Multimedia (MM '10), Firenze, Italy, 25-29 Oct 2010, pp. 847-850. (doi: 10.1145/1873951.1874094)

Full text not currently available from Enlighten.

This paper proposes an approach for the automatic recognition of roles in settings such as news and talk shows, where roles correspond to specific functions such as Anchorman, Guest or Interview Participant. The approach relies purely on nonverbal vocal behavioral cues: who talks when and how much (turn-taking behavior), and statistical properties of pitch, formants, energy and speaking rate (prosodic behavior). The experiments were performed over a corpus of around 50 hours of broadcast material, and the accuracy, measured as the percentage of time correctly labeled in terms of role, is up to 89%. Both turn-taking and prosodic behavior lead to satisfactory results. Furthermore, on one database, their combination leads to a statistically significant improvement.
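The accuracy measure described in the abstract (percentage of time correctly labeled in terms of role) can be illustrated with a minimal sketch. The segment format, the function name `time_weighted_accuracy`, and the example data below are assumptions made for illustration only; they are not taken from the paper and do not reproduce its evaluation code or results.

```python
from typing import List, Tuple

# Hypothetical segment format (an assumption, not from the paper):
# (start_seconds, end_seconds, true_role, predicted_role)
Segment = Tuple[float, float, str, str]


def time_weighted_accuracy(segments: List[Segment]) -> float:
    """Return the fraction of total segment time whose predicted role is correct."""
    total = 0.0
    correct = 0.0
    for start, end, true_role, predicted_role in segments:
        duration = end - start
        if duration < 0:
            raise ValueError("segment end precedes segment start")
        total += duration
        if true_role == predicted_role:
            correct += duration
    if total == 0:
        raise ValueError("no time to score")
    return correct / total


# Made-up example: 100 seconds of audio, 20 of them mislabeled.
example = [
    (0.0, 30.0, "Anchorman", "Anchorman"),
    (30.0, 50.0, "Guest", "Anchorman"),
    (50.0, 100.0, "Guest", "Guest"),
]

print(f"{time_weighted_accuracy(example):.2f}")
```

Weighting by duration rather than counting segments means that a long, correctly labeled turn contributes more to the score than a short, mislabeled one.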

Item Type: Conference Proceedings
Glasgow Author(s) Enlighten ID: Vinciarelli, Professor Alessandro and Salamin, Mr Hugues
Authors: Salamin, H., Vinciarelli, A., Truong, K., and Mohammadi, G.
College/School: College of Science and Engineering > School of Computing Science
