Abstract
Existing clustering techniques can group genes from time series microarray data, but the distance metrics they use are hard to interpret for this type of data. Some previous methods focus on matching expression levels, whereas the genes of interest here are those that behave in the same manner at different levels. Such genes are not clustered together under a Euclidean metric and are indiscernible under a correlation metric, so we propose a more appropriate metric and a modified hierarchical clustering method to highlight them. Hashing and bucket sort make the clustering fast, and the hierarchical dendrogram allows direct comparison of genes, with the distance having an easily understood meaning. The method also extends naturally to k-means clustering when the desired number of clusters is known.
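The abstract does not define the proposed metric, so the following is only an illustrative sketch of the general idea, not the chapter's method. It assumes a hypothetical "signature" made of the direction of change between consecutive time points, uses a dictionary as the hash table that places genes with identical signatures in the same bucket, and counts differing positions as a stand-in distance. The function names, the `tol` parameter, and the toy profiles are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Plain Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def signature(profile, tol=0.0):
    """Hypothetical signature: +1, -1 or 0 for each step between time points.

    This is an assumed stand-in, not the metric defined in the chapter.
    """
    sig = []
    for prev, cur in zip(profile, profile[1:]):
        diff = cur - prev
        if diff > tol:
            sig.append(1)
        elif diff < -tol:
            sig.append(-1)
        else:
            sig.append(0)
    return tuple(sig)


def bucket_by_signature(profiles):
    """Hash each gene by its signature so identical shapes share a bucket."""
    buckets = defaultdict(list)
    for name, profile in profiles.items():
        buckets[signature(profile)].append(name)
    return dict(buckets)


def signature_distance(s, t):
    """Assumed distance: number of time steps where the directions differ."""
    return sum(1 for a, b in zip(s, t) if a != b)


# Toy data (invented for illustration only).
profiles = {
    "geneA": [1.0, 2.0, 1.5, 3.0],
    "geneB": [5.0, 6.5, 6.0, 9.0],  # same up/down/up shape, higher level
    "geneC": [1.0, 0.5, 1.5, 1.0],  # level close to geneA, opposite shape
}

buckets = bucket_by_signature(profiles)
d_ab = euclidean(profiles["geneA"], profiles["geneB"])
d_ac = euclidean(profiles["geneA"], profiles["geneC"])
```

In this toy case the Euclidean distance places `geneA` closer to `geneC` than to `geneB`, while the assumed signature puts `geneA` and `geneB` in one bucket. The signature distance also has a direct reading: the number of time steps at which two genes move in different directions.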
© 2011 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Koenig, L., Youn, E. (2011). Hierarchical Signature Clustering for Time Series Microarray Data. In: Arabnia, H., Tran, QN. (eds) Software Tools and Algorithms for Biological Systems. Advances in Experimental Medicine and Biology, vol 696. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-7046-6_6
DOI: https://doi.org/10.1007/978-1-4419-7046-6_6
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4419-7045-9
Online ISBN: 978-1-4419-7046-6
eBook Packages: Biomedical and Life Sciences (R0)