Husmeier, D. and Wright, F. (2001) Detection of recombination in DNA multiple alignments with hidden Markov models. Journal of Computational Biology, 8(4), pp. 401-427. (doi: 10.1089/106652701752236214)
Abstract
Conventional phylogenetic tree estimation methods assume that all sites in a DNA multiple alignment have the same evolutionary history. This assumption is violated in data sets from certain bacteria and viruses due to recombination, a process that leads to the creation of mosaic sequences from different strains and, if undetected, causes systematic errors in phylogenetic tree estimation. In the current work, a hidden Markov model (HMM) is employed to detect recombination events in multiple alignments of DNA sequences. The emission probabilities in a given state are determined by the branching order (topology) and the branch lengths of the respective phylogenetic tree, while the transition probabilities depend on the global recombination probability. The present study improves on an earlier heuristic parameter optimization scheme and shows how the branch lengths and the recombination probability can be optimized in a maximum likelihood sense by applying the expectation maximization (EM) algorithm. The novel algorithm is tested on a synthetic benchmark problem and is found to clearly outperform the earlier heuristic approach. The paper concludes with an application of this scheme to a DNA sequence alignment of the argF gene from four Neisseria strains, where a likely recombination event is clearly detected.
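The HMM structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: each hidden state stands for one tree topology (four taxa admit three unrooted topologies), a single hypothetical parameter `recomb_prob` is the probability of switching topology between adjacent sites, switches are spread uniformly over the other topologies, and the initial state distribution is uniform. The per-site emission likelihoods in `toy_likelihoods` are invented numbers; in practice they would be computed from each tree's topology and branch lengths. The EM optimization of branch lengths and of the recombination probability is not shown. The function names are illustrative only.

```python
def transition_matrix(n_states, recomb_prob):
    """Stay in the same topology with probability 1 - recomb_prob;
    otherwise switch uniformly to one of the other topologies (assumption)."""
    switch = recomb_prob / (n_states - 1)
    return [[1.0 - recomb_prob if i == j else switch for j in range(n_states)]
            for i in range(n_states)]


def posterior_topologies(site_likelihoods, recomb_prob):
    """Scaled forward-backward pass over the alignment columns.

    site_likelihoods[t][k] is the likelihood of column t under topology k.
    Returns, for every site, the posterior probability of each topology.
    """
    n_sites = len(site_likelihoods)
    n_states = len(site_likelihoods[0])
    trans = transition_matrix(n_states, recomb_prob)

    # Forward pass, normalised at every site to avoid numerical underflow.
    forward = []
    prev = [1.0 / n_states] * n_states  # uniform initial distribution (assumption)
    for t, emit in enumerate(site_likelihoods):
        if t == 0:
            alpha = [prev[k] * emit[k] for k in range(n_states)]
        else:
            alpha = [emit[k] * sum(prev[j] * trans[j][k] for j in range(n_states))
                     for k in range(n_states)]
        norm = sum(alpha)
        prev = [a / norm for a in alpha]
        forward.append(prev)

    # Backward pass, also normalised at every site.
    backward = [[1.0] * n_states for _ in range(n_sites)]
    for t in range(n_sites - 2, -1, -1):
        emit_next = site_likelihoods[t + 1]
        beta = [sum(trans[k][j] * emit_next[j] * backward[t + 1][j]
                    for j in range(n_states))
                for k in range(n_states)]
        norm = sum(beta)
        backward[t] = [b / norm for b in beta]

    # Combine both passes and renormalise to obtain per-site posteriors.
    posteriors = []
    for a, b in zip(forward, backward):
        gamma = [x * y for x, y in zip(a, b)]
        norm = sum(gamma)
        posteriors.append([g / norm for g in gamma])
    return posteriors


# Toy input: five sites favouring topology 0, then five favouring topology 1.
toy_likelihoods = [[0.6, 0.2, 0.2]] * 5 + [[0.2, 0.6, 0.2]] * 5
toy_posteriors = posterior_topologies(toy_likelihoods, recomb_prob=0.1)
most_probable = [max(range(3), key=lambda k: p[k]) for p in toy_posteriors]
```

Here `most_probable` holds the index of the most probable topology at each site; a change of index along the alignment marks a candidate recombination breakpoint under these assumptions.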
Item Type: | Articles |
---|---|
Additional Information: | This is a copy of an article published in the Journal of Computational Biology © 2001 Mary Ann Liebert, Inc.; Journal of Computational Biology is available online at: http://online.liebertpub.com. |
Status: | Published |
Refereed: | Yes |
Glasgow Author(s) Enlighten ID: | Husmeier, Professor Dirk |
Authors: | Husmeier, D., and Wright, F. |
College/School: | College of Science and Engineering > School of Mathematics and Statistics > Statistics |
Journal Name: | Journal of Computational Biology |
Publisher: | Mary Ann Liebert |
ISSN: | 1066-5277 |
ISSN (Online): | 1557-8666 |
Copyright Holders: | Copyright © 2001 Mary Ann Liebert, Inc. |
First Published: | First published in Journal of Computational Biology 8(4):401-427 |
Publisher Policy: | Reproduced in accordance with the copyright policy of the publisher |