Using Entities in Knowledge Graph Hierarchies to Classify Sensitive Information

Frayling, E., Macdonald, C., McDonald, G. and Ounis, I. (2022) Using Entities in Knowledge Graph Hierarchies to Classify Sensitive Information. In: 13th International Conference of the CLEF Association (CLEF 2022), Bologna, Italy, 5-8 Sept 2022, pp. 125-132. ISBN 9783031136429 (doi: 10.1007/978-3-031-13643-6_10)

Text classification has been shown to be effective for assisting human reviewers to identify sensitive information when reviewing documents to release to the public. However, automatically classifying sensitive information is difficult, since sensitivity is often due to contextual knowledge that must be inferred from the text. For example, the mention of a specific named entity is unlikely to provide enough context to automatically know if the information is sensitive. However, knowing the conceptual role of the entity, e.g. if the entity is a politician or a terrorist, can provide useful additional contextual information. Human sensitivity reviewers use their prior knowledge of such contextual information when making sensitivity judgements, whereas statistical or contextualized classifiers cannot easily resolve these cases from the text alone. In this paper, we propose a feature extraction method that models entities in a hierarchical structure, based on the underlying structure of Wikipedia, to generate a more informative representation of entities and their roles. Our experiments, on a test collection containing real-world sensitivities, show that our proposed approach results in a significant improvement in sensitivity classification performance (2.2% BAC, McNemar’s Test, p < 0.05) compared to a text-based sensitivity classifier.
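
As a rough illustration of the idea in the abstract, the sketch below expands each recognised entity into every category on its path to the root of a hierarchy, so that a classifier can see an entity's role (e.g. politician) as well as its more general types. It is a minimal sketch under stated assumptions, not the authors' implementation: the toy `PARENT` hierarchy, the `ENTITY_TYPE` lookup, the entity names and all function names are hypothetical, whereas the paper derives its hierarchy from the structure of Wikipedia. The `balanced_accuracy` helper assumes that BAC denotes balanced accuracy, i.e. the mean of the recall on each class.

```python
from collections import Counter

# Hypothetical toy hierarchy (child -> parent), for illustration only.
PARENT = {
    "Politician": "Person",
    "Terrorist": "Criminal",
    "Criminal": "Person",
    "Person": "Agent",
}

# Hypothetical lookup from an entity mention to its most specific category.
ENTITY_TYPE = {
    "Jane Doe": "Politician",
    "John Roe": "Terrorist",
}


def ancestors(category):
    """Return the category followed by its ancestors, most specific first."""
    path = [category]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def hierarchy_features(entities):
    """Count every category on the path from each known entity's type to the root."""
    counts = Counter()
    for entity in entities:
        category = ENTITY_TYPE.get(entity)
        if category is None:
            continue  # entity not found in the lookup: contributes no features
        for level in ancestors(category):
            counts["cat=" + level] += 1
    return counts


def balanced_accuracy(y_true, y_pred):
    """Mean of positive-class and negative-class recall (assumes both classes occur)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = sum(1 for t in y_true if t == 0)
    return 0.5 * (tp / pos + tn / neg)


if __name__ == "__main__":
    print(hierarchy_features(["Jane Doe", "John Roe", "Unknown Person"]))
    print(balanced_accuracy([1, 1, 0, 0], [1, 0, 0, 0]))
```

In a pipeline of this kind, the resulting counts would typically be appended to the text features before training the classifier; sharing general categories such as `cat=Person` lets entities with different specific roles still contribute common evidence.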

Item Type: Conference Proceedings
Additional Information: E. Frayling, C. Macdonald and I. Ounis acknowledge the support of Innovate UK through a Knowledge Transfer Partnership (# 12040). All authors thank SVGC Ltd. for their support.
Glasgow Author(s) Enlighten ID: McDonald, Dr Graham and Frayling, Mr Erlend and Ounis, Professor Iadh and Macdonald, Professor Craig
Authors: Frayling, E., Macdonald, C., McDonald, G., and Ounis, I.
College/School: College of Science and Engineering > School of Computing Science
Published Online: 25 August 2022
Copyright Holders: Copyright © 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
First Published: First published in Proceedings of the 13th International Conference of the CLEF Association (CLEF 2022) Bologna, Italy, September 5–8, 2022, LNCS 13390:125-132
Publisher Policy: Reproduced in accordance with the copyright policy of the publisher