Abstract
This chapter presents a Bayesian method for model-based clustering of gene expression dynamics and a program implementing it. The method represents gene expression dynamics as autoregressive equations and uses an agglomerative procedure to search for the most probable set of clusters, given the available data. The main contributions of this approach are the ability to take into account the dynamic nature of gene expression time series during clustering and an automated, principled way to decide when two series are different enough to belong to different clusters. The reliance of this method on an explicit statistical representation of gene expression dynamics makes it possible to use standard statistical techniques to assess the goodness of fit of the resulting model and validate the underlying assumptions. A set of gene expression time series, collected to study the response of human fibroblasts to serum, is used to illustrate the properties of the method and the functionality of the program.
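The two ingredients named above, an autoregressive model per cluster and an agglomerative search that merges clusters only when the merge makes the data more probable, can be illustrated with a minimal sketch. It is not the chapter's program: the function names are hypothetical, the autoregressive order is fixed by the caller, and a BIC-style score is used as a rough stand-in for the marginal likelihood that a full Bayesian treatment would compute.

```python
import numpy as np


def ar_design(series, p):
    """Build the regression form of an AR(p) model with intercept:
    y[t] = b0 + b1 * y[t-1] + ... + bp * y[t-p] + noise."""
    series = np.asarray(series, dtype=float)
    y = series[p:]
    lags = [series[p - k:len(series) - k] for k in range(1, p + 1)]
    X = np.column_stack([np.ones(len(y))] + lags)
    return X, y


def cluster_score(members, p=1):
    """BIC-style approximation to the log marginal likelihood of a cluster.
    ASSUMPTION: all member series share one coefficient vector and one
    noise variance; BIC replaces the exact marginal likelihood."""
    parts = [ar_design(s, p) for s in members]
    X = np.vstack([part[0] for part in parts])
    y = np.concatenate([part[1] for part in parts])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n = len(y)
    sigma2 = max(float(resid @ resid) / n, 1e-12)  # guard against log(0)
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    n_params = X.shape[1] + 1  # coefficients plus the variance
    return loglik - 0.5 * n_params * np.log(n)


def agglomerate(series, p=1):
    """Greedy agglomerative search: start with one cluster per series and
    repeatedly apply the merge with the largest score gain; stop when no
    merge improves the total score."""
    clusters = [[i] for i in range(len(series))]

    def score(cluster):
        return cluster_score([series[i] for i in cluster], p)

    while len(clusters) > 1:
        best_gain, best_pair = 0.0, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                merged = clusters[a] + clusters[b]
                gain = score(merged) - score(clusters[a]) - score(clusters[b])
                if gain > best_gain:
                    best_gain, best_pair = gain, (a, b)
        if best_pair is None:
            break  # every remaining merge lowers the score
        a, b = best_pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters
```

Calling `agglomerate(list_of_series, p=1)` returns a list of clusters, each a list of indices into the input. The stopping rule is the point of interest: the number of clusters is not fixed in advance but follows from comparing the score of a merged cluster against the combined score of its parts.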
Copyright information
© 2003 Springer-Verlag New York, Inc.
About this chapter
Cite this chapter
Sebastiani, P., Ramoni, M., Kohane, I.S. (2003). Bayesian Clustering of Gene Expression Dynamics. In: Parmigiani, G., Garrett, E.S., Irizarry, R.A., Zeger, S.L. (eds) The Analysis of Gene Expression Data. Statistics for Biology and Health. Springer, New York, NY. https://doi.org/10.1007/0-387-21679-0_18
DOI: https://doi.org/10.1007/0-387-21679-0_18
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-95577-3
Online ISBN: 978-0-387-21679-9
eBook Packages: Springer Book Archive