Automated Crisis Content Categorization for COVID-19 Tweet Streams

Long, Z. and McCreadie, R. (2021) Automated Crisis Content Categorization for COVID-19 Tweet Streams. In: 18th International Conference on Information Systems for Crisis Response and Management, Blacksburg, VA, USA, 23-26 May 2021, pp. 667-678.

Text: 252696.pdf - Published Version (1MB)
Available under License Creative Commons Attribution.

Abstract

Social media platforms such as Twitter are increasingly used by billions of people internationally to share information. As such, these platforms contain vast volumes of real-time multimedia content about the world, which could be invaluable for a range of tasks such as incident tracking, damage estimation during disasters, and insurance risk estimation. Mining this real-time data offers substantial economic benefits, as well as opportunities to save lives. The COVID-19 pandemic is currently affecting societies at an unprecedented speed and scale, forming an important use-case for social media analysis. However, the amount of information produced during such crisis events is vast, and it typically exists in multiple, unstructured formats, making manual analysis very time consuming. Hence, in this paper, we examine how to automatically extract valuable information from tweets related to COVID-19. For 12 geographical locations, we experiment with supervised approaches for labelling tweets into 7 crisis categories, and we investigate automatic priority estimation, using both classical and deep learning approaches. Through evaluation on the TREC-IS 2020 COVID-19 datasets, we demonstrate that effective automatic labelling for this task is possible, with an average F1 score of 61% across crisis categories, and we analyse the key factors that affect model performance and model generalizability across locations.
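
As a rough illustration of what an F1 score averaged across crisis categories means, the sketch below computes per-category F1 for a multi-label tweet labelling task and then takes the unweighted mean. It assumes macro-averaging, which the abstract does not confirm; the category names, the toy labels, and the function names are hypothetical and are not taken from the paper or from TREC-IS.

```python
from typing import Dict, List, Set


def per_category_f1(
    gold: List[Set[str]], pred: List[Set[str]], categories: List[str]
) -> Dict[str, float]:
    """Compute F1 separately for each category over a list of tweets.

    gold[i] and pred[i] are the sets of categories assigned to tweet i.
    """
    scores: Dict[str, float] = {}
    for cat in categories:
        pairs = list(zip(gold, pred))
        tp = sum(1 for g, p in pairs if cat in g and cat in p)
        fp = sum(1 for g, p in pairs if cat not in g and cat in p)
        fn = sum(1 for g, p in pairs if cat in g and cat not in p)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the category never occurs.
        scores[cat] = (2 * tp / denom) if denom else 0.0
    return scores


def macro_f1(scores: Dict[str, float]) -> float:
    """Unweighted mean of per-category F1 (assumed averaging scheme)."""
    return sum(scores.values()) / len(scores)


# Hypothetical placeholder categories and toy labels for four tweets.
categories = ["advice", "news"]
gold = [{"advice"}, {"news"}, {"advice", "news"}, set()]
pred = [{"advice"}, {"advice"}, {"news"}, {"news"}]

scores = per_category_f1(gold, pred, categories)
average = macro_f1(scores)
print(scores, average)
```

Under macro-averaging every category counts equally, so rare categories influence the average as much as frequent ones; a micro-averaged score would instead pool the counts over all categories before computing F1.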

Item Type: Conference Proceedings
Keywords: COVID-19 tweets classification, crisis management, deep learning, BERT, supervised learning.
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Long, Zijun and McCreadie, Dr Richard
Authors: Long, Z., and McCreadie, R.
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
College/School: College of Science and Engineering > School of Computing Science
Research Group: Information Retrieval
ISBN: 9781949373615
Published Online: 23 May 2021
Copyright Holders: Copyright © 2021 The Authors
First Published: First published in Proceedings of the 18th ISCRAM Conference: 667-678
Publisher Policy: Reproduced under a Creative Commons License