Semi-parametric analysis of multi-rater data

Rogers, S., Girolami, M. and Polajnar, T. (2009) Semi-parametric analysis of multi-rater data. Statistics and Computing, 20(3), pp. 317-334. (doi:10.1007/s11222-009-9125-z)


Datasets that are subjectively labeled by a number of experts are becoming more common in tasks such as biological text annotation, where class definitions are necessarily somewhat subjective. Standard classification and regression models are not suited to multiple labels, and typically a pre-processing step (normally assigning the majority class) is performed. We propose Bayesian models for classification and ordinal regression that naturally incorporate multiple expert opinions in defining predictive distributions. The models make use of Gaussian process priors, resulting in great flexibility and particular suitability to text-based problems, where the number of covariates can be far greater than the number of data instances. We show that using all labels rather than just the majority improves performance on a recent biological dataset.

Item Type: Articles
Additional Information: The original publication is available at
Glasgow Author(s) Enlighten ID: Rogers, Dr Simon and Girolami, Prof Mark
Authors: Rogers, S., Girolami, M., and Polajnar, T.
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science; Q Science > QA Mathematics; H Social Sciences > HA Statistics
College/School: College of Science and Engineering > School of Computing Science
Research Group: Inference
Journal Name: Statistics and Computing
ISSN (Online): 1573-1375
Published Online: 24 April 2009
Copyright Holders: Copyright © 2009 Springer
First Published: First published in Statistics and Computing
Publisher Policy: Reproduced in accordance with the copyright policy of the publisher.


Project Code / Award No: 399341
Project Name: Stochastic modelling and statistical inference of gene regulatory pathways - integrating multiple sources of data
Principal Investigator: Ernst Wit
Funder's Name: Engineering & Physical Sciences Research Council (EPSRC)
Funder Ref: EP/C010620/1
Lead Dept: Statistics