An Approach to Document Fingerprinting

Kim, Y. and Ross, S. (2015) An Approach to Document Fingerprinting. In: Allen, R., Hunter, J. and Zeng, M.L. (eds.) Digital Libraries: Providing Quality Information. Series: Lecture Notes in Computer Science (9469). Springer: Cham, pp. 107-119. ISBN 9783319279732 (doi: 10.1007/978-3-319-27974-9_11)

113792.pdf - Accepted Version



The nature of an individual document is often defined by its relationship to selected tasks, societal values, and cultural meaning. The identifying features, regardless of whether the document content is textual, aural or visual, are often delineated in terms of descriptions about the document, for example, intended audience, coverage of topics, purpose of creation, structure of presentation as well as relationships to other entities expressed by authorship, ownership, production process, and geographical and temporal markers. To secure a comprehensive view of a document, therefore, we must draw heavily on cognitive and/or computational resources not only to extract and classify information at multiple scales, but also to interlink these across multiple dimensions in parallel. Here we present a preliminary thought experiment for fingerprinting documents using textual documents visualised and analysed at multiple scales and dimensions to explore patterns on which we might capitalise.

Item Type: Book Sections
Keywords: Text analysis, natural language processing, patterns, readability.
Glasgow Author(s) Enlighten ID: Kim, Dr Yunhyong and Ross, Professor Seamus
Authors: Kim, Y., and Ross, S.
Subjects: Z Bibliography. Library Science. Information Resources > Z665 Library Science. Information Science
College/School: College of Arts & Humanities > School of Humanities > Information Studies
Copyright Holders: Copyright © 2015 Springer International Publishing
First Published: First published in Lecture Notes in Computer Science 9469:107-119
Publisher Policy: Reproduced in accordance with the copyright policy of the publisher.


Project Code: 559231
Award No:
Project Name: BlogForever
Principal Investigator: Seamus Ross
Funder's Name: European Commission (EC)
Funder Ref: BlogForever
Lead Dept: HU - ARTS AND MEDIA INFORMATICS (HATII)