Authorship attribution and pastiche

Somers, H. and Tweedie, F. (2003) Authorship attribution and pastiche. Computers and the Humanities, 37, pp. 407-429.

Full text not currently available from Enlighten.

Abstract

This paper considers the question of authorship attribution techniques when faced with a pastiche. We ask whether the techniques can distinguish the real thing from the fake, or can the author fool the computer? If the latter, is this because the pastiche is good, or because the technique is faulty? Using a number of mainly vocabulary-based techniques, Gilbert Adair's pastiche of Lewis Carroll, Alice Through the Needle's Eye, is compared with the original 'Alice' books. Standard measures of lexical richness, Yule's K and Orlov's Z, both distinguish Adair from Carroll, though Z also distinguishes the two originals. A principal component analysis based on word frequencies finds that the main differences are not due to authorship. A discriminant analysis based on word usage and lexical richness successfully distinguishes the pastiche from the originals. Weighted cusum tests were also unable to distinguish the two authors in a majority of cases. As a cross-validation, we made similar comparisons with control texts: another children's story from the same era, and other work by Carroll and Adair. The implications of these findings are discussed.

Item Type: Articles
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: UNSPECIFIED
Authors: Somers, H., and Tweedie, F.
College/School: College of Science and Engineering > School of Mathematics and Statistics > Statistics
Journal Name: Computers and the Humanities
Publisher: Kluwer Academic Publishers
ISSN: 0010-4817
ISSN (Online): 1572-8412
