Artificial Intelligence in Medicine
Volume 35, Issue 1, Pages 9-18, September 2005

Finding the biologically optimal alignment of multiple sequences

Received 8 November 2004, Revised 27 December 2004, Accepted 12 January 2005.

PII: S0933-3657(05)00062-X

doi: 10.1016/j.artmed.2005.01.007
