Artificial Intelligence in Medicine
Volume 35, Issue 1, Pages 107-119, September 2005

Computational modeling of oligonucleotide positional densities for human promoter prediction

Received 30 October 2004, Revised 31 January 2005, Accepted 22 February 2005.

Availability: A binary executable of the promoter prediction model, named BayesProm, is available at http://www.comp.nus.edu.sg/~bioinfo/BayesProm (accessed 1 May 2005).

PII: S0933-3657(05)00055-2

doi: 10.1016/j.artmed.2005.02.005
