A Semiparametric Bayesian Method of Clustering Genes Using Time-Series of Expression Profiles

  • Arvind K. Jammalamadaka
  • Kaushik Ghosh


An increasing number of microarray experiments measure gene expression levels at several points in time. In this article, we present two models for clustering such time series of expression profiles. We use nonparametric Bayesian methods, which make the models robust to misspecification and, through Dirichlet process priors, provide a natural framework for clustering the genes. Unlike in other clustering techniques, the resulting number of clusters is determined entirely by the data. We demonstrate the effectiveness of our methodology through simulation studies with artificial data and an application to a real data set.
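
As a rough illustration of why a Dirichlet process prior leaves the number of clusters open, the sketch below draws a random partition from the Chinese restaurant process, the partition distribution induced by a Dirichlet process. It is not either of the two models presented in the article: the function name `sample_crp_partition`, the concentration value `alpha=1.0`, and the choice of 200 items are assumptions made only for this example, and no expression data or likelihood is involved.

```python
import random


def sample_crp_partition(n_items, alpha, seed=0):
    """Draw cluster labels for n_items from a Chinese restaurant process.

    Hypothetical helper for illustration only; alpha > 0 is the
    concentration parameter of the underlying Dirichlet process.
    """
    rng = random.Random(seed)
    labels = []  # labels[i] is the cluster index assigned to item i
    sizes = []   # sizes[k] is the current number of items in cluster k
    for _ in range(n_items):
        # An existing cluster k is chosen with weight sizes[k];
        # a brand-new cluster is opened with weight alpha.
        weights = sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)   # open a new cluster
        else:
            sizes[k] += 1     # join an existing cluster
        labels.append(k)
    return labels, sizes


labels, sizes = sample_crp_partition(n_items=200, alpha=1.0)
print("number of clusters:", len(sizes))
print("cluster sizes:", sizes)
```

The number of clusters is an outcome of the draw rather than an input, which is the sense in which it is data driven once such a prior is combined with a likelihood for the expression profiles. Larger values of `alpha` tend to produce more clusters.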


Dirichlet Process; Heteroscedastic Model; Markov Chain Monte Carlo Procedure; Dirichlet Process Mixture Model; Time Series Gene Expression
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  1. Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, USA
