A Bayesian hidden Markov model for motif discovery through joint modeling of genomic sequence and ChIP-chip data

Gelfond, J.A.L., Gupta, M. and Ibrahim, J.G. (2009) A Bayesian hidden Markov model for motif discovery through joint modeling of genomic sequence and ChIP-chip data. Biometrics, 65(4), pp. 1087-1095. (doi: 10.1111/j.1541-0420.2008.01180.x)

Full text not currently available from Enlighten.

Abstract

We propose a unified framework for the analysis of chromatin (Ch) immunoprecipitation (IP) microarray (ChIP-chip) data for detecting transcription factor binding sites (TFBSs) or motifs. ChIP-chip assays are used to focus the genome-wide search for TFBSs by isolating a sample of DNA fragments with TFBSs and applying this sample to a microarray with probes corresponding to tiled segments across the genome. Present analytical methods use a two-step approach: (i) analyze array data to estimate IP-enrichment peaks then (ii) analyze the corresponding sequences independently of intensity information. The proposed model integrates peak finding and motif discovery through a unified Bayesian hidden Markov model (HMM) framework that accommodates the inherent uncertainty in both measurements. A Markov chain Monte Carlo algorithm is formulated for parameter estimation, adapting recursive techniques used for HMMs. In simulations and applications to a yeast RAP1 dataset, the proposed method has favorable TFBS discovery performance compared to currently available two-stage procedures in terms of both sensitivity and specificity.
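The "recursive techniques used for HMMs" mentioned in the abstract are not spelled out in this record. A common recursion of this kind is forward filtering followed by backward sampling, which draws a whole hidden-state path in one step of an MCMC sampler. The sketch below illustrates that general idea only and is not the authors' algorithm. It assumes a hypothetical two-state chain (0 = background, 1 = IP-enriched) with Gaussian probe intensities, and it leaves out the sequence (motif) component that the paper models jointly. All function names and parameter values are illustrative assumptions.

```python
import math
import random


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def forward_filter(obs, init, trans, means, sds):
    """Return filtered probabilities P(state_t | obs_1..t) for every t."""
    n_states = len(init)
    filtered = []
    prev = None
    for t, x in enumerate(obs):
        if t == 0:
            prior = list(init)
        else:
            # One-step prediction: propagate the previous filtered
            # probabilities through the transition matrix.
            prior = [
                sum(prev[i] * trans[i][j] for i in range(n_states))
                for j in range(n_states)
            ]
        # Weight by the emission density, then normalise.
        unnorm = [
            prior[j] * gaussian_pdf(x, means[j], sds[j])
            for j in range(n_states)
        ]
        total = sum(unnorm)
        prev = [u / total for u in unnorm]
        filtered.append(prev)
    return filtered


def backward_sample(filtered, trans, rng):
    """Draw one state path from its joint posterior given all observations."""
    n_states = len(trans)
    n_obs = len(filtered)
    path = [0] * n_obs
    # The last state is drawn from the final filtered distribution.
    path[-1] = rng.choices(range(n_states), weights=filtered[-1])[0]
    for t in range(n_obs - 2, -1, -1):
        nxt = path[t + 1]
        # P(state_t = i | state_{t+1}, obs_1..t) is proportional to
        # filtered[t][i] * trans[i][nxt].
        weights = [filtered[t][i] * trans[i][nxt] for i in range(n_states)]
        path[t] = rng.choices(range(n_states), weights=weights)[0]
    return path


# Illustrative inputs (assumed values, not taken from the paper).
intensities = [0.1, -0.2, 0.0, 2.9, 3.2, 3.1, 0.2, -0.1]
init = [0.9, 0.1]
trans = [[0.9, 0.1], [0.2, 0.8]]
means = [0.0, 3.0]
sds = [1.0, 1.0]

filtered = forward_filter(intensities, init, trans, means, sds)
path = backward_sample(filtered, trans, random.Random(0))
```

In a full sampler, a draw like `path` would alternate with updates of the emission and transition parameters conditional on the sampled states. A production implementation would also work on the log scale to avoid numerical underflow on long probe sequences.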

Item Type: Articles
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Gupta, Professor Mayetri
Authors: Gelfond, J.A.L., Gupta, M., and Ibrahim, J.G.
College/School: College of Science and Engineering > School of Mathematics and Statistics > Statistics
Journal Name: Biometrics
ISSN: 0006-341X
ISSN (Online): 1541-0420
Published Online: 05 February 2009
