Kim, Y., Ross, S. and Berninger, V. (2008) KRYS I Corpus. [Website]
Full text not currently available from Enlighten.
Abstract
The KRYS I corpus is a collection of over 6300 documents labelled with their genre classes. It was constructed as part of a research initiative, driven by the Digital Curation Centre, to automate document genre classification. The work was carried out at the Humanities Advanced Technology and Information Institute (HATII), University of Glasgow, between 2005 and 2008. The notion of genre is deeply embedded in the way humans organise information. Identifying the genre of a document helps to characterise the physical and conceptual structure of the text, and so to capture the style and location of further information within it. There have been very few genre-labelled corpora available to the research community. Our corpus is made available here to fill this gap and serve as a valuable resource for researchers in metadata extraction, digital curation, text classification, text mining, computational linguistics, and pattern recognition.
Item Type: | Website |
---|---|
Status: | Published |
Glasgow Author(s) Enlighten ID: | Kim, Dr Yunhyong and Ross, Professor Seamus |
Authors: | Kim, Y., Ross, S., and Berninger, V. |
Subjects: | Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4050 Electronic information resources |
College/School: | College of Arts & Humanities > School of Humanities > Information Studies |