Towards Maximising Openness in Digital Sensitivity Review using Reviewing Time Predictions

McDonald, G., Macdonald, C. and Ounis, I. (2018) Towards Maximising Openness in Digital Sensitivity Review using Reviewing Time Predictions. In: 40th European Conference on Information Retrieval (ECIR 2018), Grenoble, France, 25-29 Mar 2018, pp. 699-706. ISBN 9783319769400 (doi: 10.1007/978-3-319-76941-7_65)

154198.pdf - Accepted Version



The adoption of born-digital documents, such as email, by governments, including those of the UK and USA, has resulted in a large backlog of born-digital documents that must be sensitivity reviewed before they can be opened to the public, to ensure that no sensitive information (e.g. personal or confidential information) is released. However, it is not practical to review the whole backlog with the available reviewing resources, so there is a need for automatic techniques that increase the number of documents that can be opened within a fixed reviewing time budget. In this paper, we conduct a user study and use the log data to build models that predict reviewing times for an average sensitivity reviewer. Moreover, we show that using our reviewing time predictions to select the order in which documents are reviewed can markedly increase the ratio of reviewed documents that are released to the public, e.g. +30% for collections with high levels of sensitivity, compared to reviewing by shortest document first. This, in turn, increases the total number of documents that are opened to the public within a fixed reviewing time budget, e.g. an extra 200 documents in 100 hours of reviewing.
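
The idea of ordering a review queue under a fixed time budget can be illustrated with a minimal Python sketch. This is not the models or the selection strategy evaluated in the paper: the `Doc` fields, the `select_within_budget` helper, and all numbers are hypothetical, and the sketch simply assumes that a predicted reviewing time is already available for each document. It contrasts a shortest-document-first ordering with an ordering by predicted reviewing time.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Doc:
    doc_id: str
    length_words: int         # hypothetical document length
    predicted_minutes: float  # hypothetical output of a reviewing-time model


def select_within_budget(
    docs: List[Doc],
    budget_minutes: float,
    key: Callable[[Doc], float],
) -> List[Doc]:
    """Review documents in the order given by `key`, stopping at the
    first document that would exceed the time budget."""
    reviewed: List[Doc] = []
    spent = 0.0
    for doc in sorted(docs, key=key):
        if spent + doc.predicted_minutes > budget_minutes:
            break
        reviewed.append(doc)
        spent += doc.predicted_minutes
    return reviewed


# Toy data (invented): the shortest document is the slowest to review.
docs = [
    Doc("a", length_words=100, predicted_minutes=30.0),
    Doc("b", length_words=200, predicted_minutes=5.0),
    Doc("c", length_words=300, predicted_minutes=5.0),
    Doc("d", length_words=400, predicted_minutes=5.0),
]

by_length = select_within_budget(docs, 30.0, key=lambda d: d.length_words)
by_time = select_within_budget(docs, 30.0, key=lambda d: d.predicted_minutes)
print(len(by_length), len(by_time))
```

In this toy setting the predicted time is also used as the cost of a review, so the comparison only shows why document length can be a poor proxy for reviewing effort; it says nothing about how many of the reviewed documents would actually be released.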

Item Type: Conference Proceedings
Glasgow Author(s) Enlighten ID: Macdonald, Professor Craig and McDonald, Dr Graham and Ounis, Professor Iadh
Authors: McDonald, G., Macdonald, C., and Ounis, I.
College/School: College of Science and Engineering > School of Computing Science
Published Online: 01 March 2018
Copyright Holders: Copyright © 2018 Springer International Publishing AG, part of Springer Nature
First Published: First published in Advances in Information Retrieval. ECIR 2018: 699-706
Publisher Policy: Reproduced in accordance with the publisher copyright policy
