Hidden Markov Model and Its Applications in Motif Findings

  • Jing Wu
  • Jun Xie
Part of the Methods in Molecular Biology book series (MIMB, volume 620)


Hidden Markov models (HMMs) have wide applications in pattern recognition. In genome sequence analysis, HMMs have been applied to the identification of regions of the genome that contain regulatory information, i.e., binding sites. In higher eukaryotes, the regulatory information is organized into modular units called cis-regulatory modules. Each module contains multiple binding sites for a specific combination of several transcription factors. In this chapter, we give a brief review of hidden Markov models, the standard HMM algorithms, and their applications to motif finding. We then introduce the application of HMMs to a more complex system, in which an HMM is combined with Bayesian inference to identify transcription factor binding sites and cis-regulatory modules.
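
As a rough illustration of one standard HMM algorithm applied to sequence annotation, the sketch below runs Viterbi decoding on a toy two-state model. Everything in it is an assumption made for illustration: the state labels ("B" for background, "S" for binding site), the probability tables, and the function name `viterbi` are hypothetical and are not taken from the chapter, whose module-finding model is more elaborate and also involves Bayesian inference.

```python
import math

# Hypothetical two-state HMM: "B" = background, "S" = binding site.
# All probabilities are illustrative values, not parameters from the chapter.
STATES = ("B", "S")
START = {"B": 0.9, "S": 0.1}
TRANS = {
    "B": {"B": 0.9, "S": 0.1},
    "S": {"B": 0.2, "S": 0.8},
}
EMIT = {
    "B": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "S": {"A": 0.05, "C": 0.45, "G": 0.45, "T": 0.05},
}


def viterbi(seq):
    """Return the most probable state path for seq as a string of state labels."""
    # Work in log space to avoid numerical underflow on long sequences.
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    backpointers = []
    for symbol in seq[1:]:
        new_score, pointer = {}, {}
        for s in STATES:
            # Best predecessor state for being in state s at this position.
            best_prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            pointer[s] = best_prev
            new_score[s] = (score[best_prev]
                            + math.log(TRANS[best_prev][s])
                            + math.log(EMIT[s][symbol]))
        score = new_score
        backpointers.append(pointer)
    # Trace back from the best final state to recover the full path.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointer in reversed(backpointers):
        state = pointer[state]
        path.append(state)
    return "".join(reversed(path))


# Toy sequence: a GC-rich stretch flanked by AT-rich background.
path = viterbi("ATATGCGCGCGCGCATAT")
```

Each character of `path` labels the corresponding base as background or binding site. With these toy parameters, a GC-rich run is labeled "S" only when it is long enough to outweigh the cost of switching states, which is the kind of trade-off an HMM makes when segmenting a sequence.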

Key words

Binding site, cis-regulatory module, hidden Markov model, motif



Copyright information

© Humana Press, a part of Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Jing Wu (1)
  • Jun Xie (2)
  1. Department of Statistics, Carnegie Mellon University, Pittsburgh, USA
  2. Department of Statistics, Purdue University, West Lafayette, USA
