Predictive response-relevant clustering of expression data provides insights into disease processes

Hopcroft, L. E.M., McBride, M. W., Harris, K. J., Sampson, A. K., McClure, J. D., Graham, D., Young, G., Holyoake, T. L., Girolami, M. A. and Dominiczak, A. F. (2010) Predictive response-relevant clustering of expression data provides insights into disease processes. Nucleic Acids Research, 38(20), pp. 6831-6840. (doi:10.1093/nar/gkq550)

Publisher's URL: http://dx.doi.org/10.1093/nar/gkq550

Abstract

This article describes and illustrates a novel method of microarray data analysis that couples model-based clustering and binary classification to form clusters of 'response-relevant' genes; that is, genes that are informative when discriminating between the different values of the response. Predictions are subsequently made using an appropriate statistical summary of each gene cluster, which we call the 'meta-covariate' representation of the cluster, in a probit regression model. We first illustrate this method by analysing a leukaemia expression dataset, before focusing closely on the meta-covariate analysis of a renal gene expression dataset in a rat model of salt-sensitive hypertension. We explore the biological insights provided by our analysis of these data. In particular, we identify a highly influential cluster of 13 genes, including three transcription factors (Arntl, Bhlhe41 and Npas2), that is implicated as being protective against hypertension in response to increased dietary sodium. Functional and canonical pathway analysis of this cluster using Ingenuity Pathway Analysis implicated transcriptional activation and circadian rhythm signalling, respectively. Although we illustrate our method using only expression data, the method is applicable to any high-dimensional datasets.
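
The idea of summarising gene clusters as meta-covariates and feeding them to a probit model can be illustrated with a rough sketch. This is not the authors' implementation: the article couples clustering and classification in one model, whereas the sketch below runs them as two separate stages, substitutes plain k-means for model-based clustering, and assumes the cluster mean is the statistical summary. All function names and the toy data are hypothetical.

```python
import math

def norm_cdf(z):
    """Standard normal CDF, used as the probit response function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def cluster_genes(profiles, k, iters=20):
    """Plain k-means over gene profiles (assumed stand-in for model-based clustering).

    profiles[g][i] is the expression of gene g in sample i.
    Returns per-gene cluster labels and the k cluster mean profiles.
    """
    n = len(profiles)
    # Deterministic initialisation: evenly spaced genes as starting centres.
    centres = [list(profiles[c * n // k]) for c in range(k)]
    labels = [0] * n
    for _ in range(iters):
        for g, p in enumerate(profiles):
            labels[g] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centres[c])),
            )
        for c in range(k):
            members = [profiles[g] for g in range(n) if labels[g] == c]
            if members:  # keep the previous centre if a cluster empties
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centres

def fit_probit(X, y, lr=0.1, steps=500):
    """Fit probit regression coefficients by gradient ascent on the log-likelihood."""
    n, d = len(X), len(X[0])
    beta = [0.0] * d
    for _ in range(steps):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            p = min(max(norm_cdf(eta), 1e-9), 1.0 - 1e-9)  # guard against division by zero
            w = norm_pdf(eta) * (yi - p) / (p * (1.0 - p))
            for j in range(d):
                grad[j] += w * xi[j] / n
        beta = [b + lr * g for b, g in zip(beta, grad)]
    return beta

def predict_proba(X, beta):
    """Predicted probability of response 1 for each sample."""
    return [norm_cdf(sum(b * v for b, v in zip(beta, xi))) for xi in X]

# Toy data, made up for illustration: 6 genes x 6 samples, binary response per sample.
profiles = [
    [0.1, 0.2, 0.0, 2.0, 2.1, 1.9],  # genes 0-2 track the response
    [0.0, 0.1, 0.2, 1.9, 2.0, 2.2],
    [0.2, 0.0, 0.1, 2.1, 1.8, 2.0],
    [1.0, 1.1, 0.9, 1.0, 1.1, 0.9],  # genes 3-5 are flat
    [0.9, 1.0, 1.1, 1.1, 0.9, 1.0],
    [1.1, 0.9, 1.0, 0.9, 1.0, 1.1],
]
response = [0, 0, 0, 1, 1, 1]

labels, centres = cluster_genes(profiles, k=2)
# Meta-covariates: one cluster-mean value per cluster per sample, plus an intercept.
X = [[1.0] + [centre[i] for centre in centres] for i in range(len(response))]
beta = fit_probit(X, response)
probs = predict_proba(X, beta)
```

In this toy setting the cluster whose mean profile tracks the response receives the large coefficient, which is the sense in which a cluster can be called influential. A joint model, as described in the abstract, would instead let the response shape the clusters themselves.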

Item Type: Articles
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Sampson, Dr Amanda and Graham, Dr Delyth and Holyoake, Professor Tessa and McBride, Dr Martin and Dominiczak, Professor Anna and Girolami, Prof Mark and Hopcroft, Dr Lisa and Young, Prof Graham and McClure, Dr John and Harris, Dr Keith
Authors: Hopcroft, L. E.M., McBride, M. W., Harris, K. J., Sampson, A. K., McClure, J. D., Graham, D., Young, G., Holyoake, T. L., Girolami, M. A., and Dominiczak, A. F.
College/School: College of Medical Veterinary and Life Sciences > Institute of Cancer Sciences
College of Medical Veterinary and Life Sciences > Institute of Cardiovascular and Medical Sciences
College of Science and Engineering > School of Engineering > Infrastructure and Environment
College of Science and Engineering > School of Mathematics and Statistics > Mathematics
Journal Name: Nucleic Acids Research
Publisher: Oxford University Press
ISSN: 0305-1048
ISSN (Online): 1362-4962
Published Online: 22 June 2010
Copyright Holders: Copyright © 2010 The Authors
First Published: First published in Nucleic Acids Research 38(20):6831-6840
Publisher Policy: Reproduced in accordance with the copyright policy of the publisher

Project Code: 396841
Award No:
Project Name: Probabilistic Reconstruction of Signalling Pathways & Identification of Novel Transcription Factors Employing Heterogeneous Genome-Wide data
Principal Investigator: Mark Girolami
Funder's Name: Medical Research Council (MRC)
Funder Ref: G0401466
Lead Dept: Computing Science