Genre classification in automated ingest and appraisal metadata.

Kim, Y. and Ross, S. (2006) Genre classification in automated ingest and appraisal metadata. In: European Conf. on advanced technology and research in Digital Libraries, Alicante, Spain, 17-22 September 2006, pp. 63-74. ISBN 9783540446361 (doi: 10.1007/11863878_6)

Publisher's URL: http://dx.doi.org/10.1007/11863878_6

Abstract

Metadata creation is a crucial aspect of the ingest of digital materials into digital libraries. The metadata needed to document and manage digital materials are extensive, and their manual creation is expensive. The Digital Curation Centre (DCC) has undertaken research to automate this process for some classes of digital material. We have segmented the problem, and this paper discusses results in genre classification as a first step toward automating metadata extraction from documents. Here we propose a classification method built on looking at documents from five directions: as an object exhibiting a specific visual format, as a linear layout of strings with characteristic grammar, as an object with stylometric signatures, as an object with intended meaning and purpose, and as an object linked to previously classified objects and other external sources. The results of some experiments in relation to the first two directions are described here; they are meant to be indicative of the promise underlying this multi-faceted approach.

Item Type: Conference Proceedings
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Kim, Dr Yunhyong and Ross, Professor Seamus
Authors: Kim, Y., and Ross, S.
Subjects: Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4050 Electronic information resources
College/School: College of Arts & Humanities > School of Humanities > Information Studies
Research Group: Digital Curation Centre
Publisher: Springer
ISSN: 1611-3349
ISBN: 9783540446361
Copyright Holders: Copyright © 2006 Springer
First Published: First published in Lecture Notes in Computer Science 4172:63-74
Publisher Policy: Reproduced in accordance with the copyright policy of the publisher
