Polajnar, T., Rogers, S. and Girolami, M. (2009) Classification of protein interaction sentences via Gaussian processes. Lecture Notes in Computer Science, 5780, pp. 282-292. (doi: 10.1007/978-3-642-04031-3_25)
Publisher's URL: http://dx.doi.org/10.1007/978-3-642-04031-3_25
Abstract
The increase in the availability of protein interaction studies in textual format, coupled with the demand for easier access to the key results, has led to a need for text mining solutions. In the text processing pipeline, classification is a key step for extraction of small sections of relevant text. Consequently, for the task of locating protein-protein interaction sentences, we examine the use of a classifier that has rarely been applied to text: Gaussian processes (GPs). GPs are a non-parametric probabilistic analogue of the more popular support vector machines (SVMs). We find that GPs outperform the SVM and naïve Bayes classifiers on binary sentence data, whilst showing equivalent performance on abstract and multiclass sentence corpora. In addition, the lack of the margin parameter, which requires costly tuning, along with the principled multiclass extensions enabled by the probabilistic framework, makes GPs an appealing alternative worthy of further adoption.
Item Type: | Articles |
---|---|
Keywords: | Gaussian processes; support vector machines; protein interaction; text mining; bioinformatics |
Status: | Published |
Refereed: | Yes |
Glasgow Author(s) Enlighten ID: | Rogers, Dr Simon and Girolami, Prof Mark |
Authors: | Polajnar, T., Rogers, S., and Girolami, M. |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
College/School: | College of Science and Engineering > School of Computing Science |
Journal Name: | Lecture Notes in Computer Science |
Publisher: | Springer Berlin / Heidelberg |
ISSN: | 0302-9743 |
ISSN (Online): | 1611-3349 |
Published Online: | 31 August 2009 |
Copyright Holders: | Copyright © 2009 Springer |
First Published: | First published in Lecture Notes in Computer Science 5780:282-292 |
Publisher Policy: | Reproduced in accordance with the copyright policy of the publisher. |