Topics in Tweets: A User Study of Topic Coherence Metrics for Twitter Data

Fang, A., Macdonald, C., Ounis, I. and Habel, P. (2016) Topics in Tweets: A User Study of Topic Coherence Metrics for Twitter Data. In: ECIR 2016: 38th European Conference on Information Retrieval, Padua, Italy, 21-23 March 2016, pp. 492-504. ISBN 9783319306704 (doi: 10.1007/978-3-319-30671-1_36)

116772.pdf - Accepted Version



Twitter offers scholars new ways to understand the dynamics of public opinion and social discussions. However, in order to understand such discussions, it is necessary to identify coherent topics that have been discussed in the tweets. To assess the coherence of topics, several automatic topic coherence metrics have been designed for classical document corpora. However, it is unclear how suitable these metrics are for topic models generated from Twitter datasets. In this paper, we use crowdsourcing to obtain pairwise user preferences of topical coherences and to determine how closely each of the metrics aligns with human preferences. Moreover, we propose two new automatic coherence metrics that use Twitter as a separate background dataset to measure the coherence of topics. We show that our proposed Pointwise Mutual Information-based metric provides the highest levels of agreement with human preferences of topic coherence over two Twitter datasets.

Item Type: Conference Proceedings
Glasgow Author(s) Enlighten ID: Macdonald, Professor Craig and Habel, Dr Philip and Ounis, Professor Iadh
Authors: Fang, A., Macdonald, C., Ounis, I., and Habel, P.
College/School: College of Science and Engineering > School of Computing Science
College of Social Sciences > School of Social and Political Sciences > Politics
Copyright Holders: Copyright © 2016 Springer
First Published: First published in Lecture Notes in Computer Science 9626:492-504
Publisher Policy: Reproduced in accordance with the copyright policy of the publisher