A Semantic Graph-Based Approach for Mining Common Topics From Multiple Asynchronous Text Streams

Chen, L., Jose, J. M., Yu, H. and Yuan, F. (2017) A Semantic Graph-Based Approach for Mining Common Topics From Multiple Asynchronous Text Streams. In: 26th International World Wide Web Conference: WWW 2017, Perth, Australia, 3-7 Apr 2017, pp. 1201-1209. ISBN 9781450349130 (doi:10.1145/3038912.3052630)

133163.pdf - Published Version
Available under License Creative Commons Attribution.

1MB

Abstract

In the age of Web 2.0, a substantial amount of unstructured content is distributed through multiple text streams in an asynchronous fashion, which makes it increasingly difficult to glean and distill useful information. An effective way to explore the information in text streams is topic modelling, which can further facilitate other applications such as search, information browsing, and pattern mining. In this paper, we propose a semantic graph-based topic modelling approach for structuring asynchronous text streams. Our model integrates topic mining and time synchronization, two core modules for addressing the problem, into a unified model. Specifically, to handle the lexical gap issue, we use global semantic graphs of each timestamp to capture the hidden interactions among entities from all the text streams. To deal with the source asynchronism problem, local semantic graphs are employed to discover similar topics of different entities that may be separated by time gaps. Our experiments on two real-world datasets show that the proposed model significantly outperforms existing ones.

Item Type: Conference Proceedings
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Jose, Professor Joemon and Chen, Dr Long and Yu, Dr Haitao
Authors: Chen, L., Jose, J. M., Yu, H., and Yuan, F.
College/School: College of Science and Engineering > School of Computing Science
ISBN: 9781450349130
Copyright Holders: Copyright © 2017 International World Wide Web Conference Committee (IW3C2)
Publisher Policy: Reproduced under a Creative Commons License

Project Code: 643481
Award No:
Project Name: A Situation-aware information infrastructure
Principal Investigator: Dimitrios Pezaros
Funder's Name: Engineering & Physical Sciences Research Council (EPSRC)
Funder Ref: EP/L026015/1
Lead Dept: COM - COMPUTING SCIENCE