Detecting sporadic recombination in DNA alignments with hidden Markov models

Husmeier, D. and Wright, F. (2000) Detecting sporadic recombination in DNA alignments with hidden Markov models. In: Bornberg-Bauer, E., Rost, U., Stoye, J. and Vingron, M. (eds.) GCB 2000: Proceedings of the German Conference on Bioinformatics. Logos Verlag: Berlin, Germany, pp. 19-26. ISBN 9783897224988

Full text not currently available from Enlighten.

Publisher's URL: http://www.informatik.uni-trier.de/~ley/db/conf/gcb/gcb2000.html

Abstract

Conventional phylogenetic tree estimation methods assume that all sites in a DNA multiple alignment have the same evolutionary history. This assumption is violated in data sets from certain bacteria and viruses due to recombination, a process that leads to the creation of mosaic sequences from different strains and, if undetected, causes systematic errors in phylogenetic tree estimation. In the current work, a hidden Markov model (HMM) is employed to detect recombination events in multiple DNA sequence alignments. The emission probabilities in a given state are determined by the branching order (topology) and the branch lengths of the respective phylogenetic tree, while the transition probabilities depend on the global frequency of recombination. All model parameters are optimized in a maximum likelihood sense with the expectation maximization (EM) algorithm. The resulting parameter optimization scheme is applied to a synthetic benchmark problem and to real DNA sequences from the argF gene of four strains of the bacterium Neisseria. In both cases we find a significant improvement over an earlier heuristic parameter estimation approach.
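
The model structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and every name in it (`transition_matrix`, `forward_backward`, `fit_nu`, `emis`, `nu`) is hypothetical. It assumes that the per-site emission probabilities for each candidate tree topology have already been computed elsewhere (in the paper they follow from the topology and branch lengths), that the hidden state keeps its topology with a single probability `nu` and otherwise switches uniformly to another topology, and that the initial state distribution is uniform. Only `nu` is re-estimated by EM here; branch-length optimization is left out.

```python
import math

def transition_matrix(nu, k):
    """K x K matrix: keep the current topology with probability nu,
    otherwise switch uniformly to one of the other k - 1 topologies
    (an assumed parameterization, chosen for illustration)."""
    off = (1.0 - nu) / (k - 1)
    return [[nu if i == j else off for j in range(k)] for i in range(k)]

def forward_backward(emis, nu):
    """emis[t][s] = P(alignment column t | topology s), assumed precomputed.
    Returns (posteriors, expected number of 'stay' transitions, log-likelihood)."""
    n, k = len(emis), len(emis[0])
    A = transition_matrix(nu, k)
    # Scaled forward pass with a uniform initial distribution (assumption).
    alpha, scale = [], []
    for t in range(n):
        if t == 0:
            a = [emis[0][s] / k for s in range(k)]
        else:
            a = [emis[t][s] * sum(alpha[t - 1][r] * A[r][s] for r in range(k))
                 for s in range(k)]
        c = sum(a)
        alpha.append([x / c for x in a])
        scale.append(c)
    # Scaled backward pass, using the same scaling factors.
    beta = [[1.0] * k for _ in range(n)]
    for t in range(n - 2, -1, -1):
        for r in range(k):
            beta[t][r] = sum(A[r][s] * emis[t + 1][s] * beta[t + 1][s]
                             for s in range(k)) / scale[t + 1]
    # Posterior probability of each topology at each site.
    post = [[alpha[t][s] * beta[t][s] for s in range(k)] for t in range(n)]
    # Expected number of adjacent site pairs that share the same topology.
    stays = 0.0
    for t in range(n - 1):
        for s in range(k):
            stays += (alpha[t][s] * A[s][s] * emis[t + 1][s]
                      * beta[t + 1][s] / scale[t + 1])
    loglik = sum(math.log(c) for c in scale)
    return post, stays, loglik

def fit_nu(emis, nu=0.9, iters=50):
    """EM for nu alone: the M-step is the expected fraction of 'stay' transitions."""
    for _ in range(iters):
        _, stays, _ = forward_backward(emis, nu)
        nu = stays / (len(emis) - 1)
        nu = min(max(nu, 1e-6), 1.0 - 1e-6)  # keep away from degenerate values
    return nu
```

Given such a matrix of emission probabilities, `forward_backward(emis, nu)[0]` gives, for each alignment column, the posterior probability of every topology; a change in the most probable topology along the alignment would then indicate a candidate recombination breakpoint.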

Item Type: Book Sections
Status: Published
Glasgow Author(s) Enlighten ID: Husmeier, Professor Dirk
Authors: Husmeier, D., and Wright, F.
College/School: College of Science and Engineering > School of Mathematics and Statistics > Statistics
Publisher: Logos Verlag
ISBN: 9783897224988
