Wang, X., Fang, A., Ounis, I. and Macdonald, C. (2019) Evaluating Similarity Metrics for Latent Twitter Topics. In: 41st European Conference on Information Retrieval (ECIR 2019), Cologne, Germany, 14-18 Apr 2019, pp. 787-794. ISBN 9783030157128 (doi: 10.1007/978-3-030-15712-8_54)
Text: 174725.pdf - Accepted Version (322kB)
Abstract
Topic modelling approaches such as LDA, when applied to a tweet corpus, can often generate a topic model containing redundant topics. To evaluate the quality of a topic model in terms of redundancy, topic similarity metrics can be applied to estimate the similarity among topics in a topic model. There are various topic similarity metrics in the literature, e.g. the Jensen-Shannon (JS) divergence-based metric. In this paper, we evaluate the performance of four distance/divergence-based topic similarity metrics and examine how they align with human judgements, including a newly proposed similarity metric that is based on computing word semantic similarity using word embeddings (WE). To obtain human judgements, we conduct a user study through crowdsourcing. Among various insights, our study shows that in general the cosine similarity (CS) and WE-based metrics perform better and appear to be complementary. However, we also find that the human assessors cannot easily distinguish between the distance/divergence-based and the semantic similarity-based metrics when identifying similar latent Twitter topics.
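As a rough illustration of two of the metrics named in the abstract, the sketch below computes the Jensen-Shannon divergence and the cosine similarity between topics represented as word-probability distributions. It uses the standard textbook definitions, which may differ from the exact formulation in the paper. The three topic vectors and the four-word vocabulary are hypothetical values invented for the example.

```python
import math


def cosine_similarity(p, q):
    """Cosine of the angle between two topic-word vectors (higher = more similar)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits (lower = more similar, bounded by 1)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical topic-word distributions over a shared 4-word vocabulary.
# topic_a and topic_b are near-duplicates; topic_c puts its mass on other words.
topic_a = [0.5, 0.3, 0.1, 0.1]
topic_b = [0.4, 0.4, 0.1, 0.1]
topic_c = [0.05, 0.05, 0.4, 0.5]

for name, other in (("a vs b", topic_b), ("a vs c", topic_c)):
    print(name,
          "JS divergence:", round(js_divergence(topic_a, other), 4),
          "cosine similarity:", round(cosine_similarity(topic_a, other), 4))
```

Under these assumed inputs, the near-duplicate pair should score a lower JS divergence and a higher cosine similarity than the dissimilar pair, which is the kind of signal a redundancy check over a topic model would look for.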
Item Type: | Conference Proceedings |
---|---|
Status: | Published |
Refereed: | Yes |
Glasgow Author(s) Enlighten ID: | Macdonald, Professor Craig and Wang, Xi and Fang, Mr Anjie and Ounis, Professor Iadh |
Authors: | Wang, X., Fang, A., Ounis, I., and Macdonald, C. |
College/School: | College of Science and Engineering > School of Computing Science |
ISBN: | 9783030157128 |
Copyright Holders: | Copyright © 2019 Springer Nature Switzerland AG |
Publisher Policy: | Reproduced in accordance with the copyright policy of the publisher |