McDonald, G., Macdonald, C. and Ounis, I. (2020) Active Learning Stopping Strategies for Technology-Assisted Sensitivity Review. In: 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2020), Xi'an, China, 25-30 Jul 2020, pp. 2053-2056. ISBN 9781450380164 (doi: 10.1145/3397271.3401267)
Abstract
Active learning strategies are often deployed in technology-assisted review tasks, such as e-discovery and sensitivity review, to learn a classifier that can assist the reviewers with their task. In particular, an active learning strategy selects the documents that are expected to be the most useful for learning an effective classifier, so that these documents can be reviewed before the less useful ones. However, when reviewing for sensitivity, the order in which the documents are reviewed can impact on the reviewers' ability to perform the review. Therefore, when deploying active learning in technology-assisted sensitivity review, we want to know when a sufficiently effective classifier has been learned, such that the active learning can stop and the reviewing order of the documents can be selected by the reviewer instead of the classifier. In this work, we propose two active learning stopping strategies for technology-assisted sensitivity review. We evaluate the effectiveness of our proposed approaches in comparison with three state-of-the-art stopping strategies from the literature. We show that our best performing approach results in a significantly more effective sensitivity classifier (+6.6% F2) than the best performing stopping strategy from the literature (McNemar's test, p<0.05).
Item Type: | Conference Proceedings |
---|---|
Status: | Published |
Refereed: | Yes |
Glasgow Author(s) Enlighten ID: | Macdonald, Professor Craig and McDonald, Dr Graham and Ounis, Professor Iadh |
Authors: | McDonald, G., Macdonald, C., and Ounis, I. |
College/School: | College of Science and Engineering > School of Computing Science |
ISBN: | 9781450380164 |
Published Online: | 25 July 2020 |
Copyright Holders: | Copyright © 2020 Association for Computing Machinery |
First Published: | First published in SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval: 2053-2056 |
Publisher Policy: | Reproduced in accordance with the publisher copyright policy |