Sensitivity Review of Large Collections by Identifying and Prioritising Coherent Documents Groups

Narvala, H., McDonald, G. and Ounis, I. (2022) Sensitivity Review of Large Collections by Identifying and Prioritising Coherent Documents Groups. In: Proceedings of the 31st ACM International Conference on Information and Knowledge Management (CIKM ’22), Atlanta, GA, USA, 17-21 October 2022, pp. 4931-4935. ISBN 9781450392365 (doi: 10.1145/3511808.3557182)

Abstract

With the massive increase in the volume of digitally produced documents, government departments face a logistical issue when conducting the manual sensitivity review of documents that should be opened to the public. When reviewing a document, sensitivity reviewers often need to quickly access related information from other documents in the collection. For example, documents that mention the same topic or event can provide the reviewers with useful contextual information and assist the reviewers to make consistent sensitivity judgements more quickly. However, it is infeasible to manually identify groups of such related documents in large unstructured collections. In this work, we present a sensitivity review system that automatically identifies groups of related documents to assist reviewers and increase the efficiency of sensitivity review. In particular, our system groups the documents that are to be sensitivity reviewed based on the documents' semantic categories (e.g., criminality). Moreover, the system identifies chronological and coherent information threads to describe the full context of an event, activity or discussion that may be spread across multiple documents. Additionally, the system prioritises the identified semantic categories and information threads for review by leveraging automatic sensitivity classification to maximise the number of documents that can be opened to the public in a limited reviewing time-budget.

Item Type: Conference Proceedings
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Ounis, Professor Iadh and McDonald, Dr Graham and Narvala, Hitarth
Authors: Narvala, H., McDonald, G., and Ounis, I.
College/School: College of Science and Engineering > School of Computing Science
ISBN: 9781450392365
Copyright Holders: © 2022 Copyright held by the owner/author(s)
First Published: First published in CIKM '22: Proceedings of the 31st ACM International Conference on Information & Knowledge Management: 4931-4935
Publisher Policy: Reproduced in accordance with the publisher copyright policy