Microarray Time-Series Data Clustering via Multiple Alignment of Gene Expression Profiles

  • Numanul Subhani
  • Alioune Ngom
  • Luis Rueda
  • Conrad Burden
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5780)


Genes with similar expression profiles are expected to be functionally related or co-regulated. Building on this idea, clustering of microarray time-series data via pairwise alignment of piecewise linear profiles has recently been introduced. We propose a k-means clustering approach based on a multiple alignment of natural cubic spline representations of gene expression profiles. The multiple alignment is achieved by minimizing the sum of integrated squared errors, over a time interval, defined on a set of profiles. Preliminary experiments on a well-known data set of 221 pre-clustered Saccharomyces cerevisiae gene expression profiles yield excellent results, with 79.64% accuracy.
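The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not state: alignment is taken to be a vertical shift of each profile, in which case shifting every spline so that its integral over the time interval is zero is one minimizer of the pairwise sum of integrated squared errors; integrals are approximated with the trapezoid rule on a dense grid; and k-means is seeded with caller-chosen profile indices. All function names (`natural_cubic_spline`, `align`, `ise`, `kmeans_profiles`) and the toy data are hypothetical.

```python
from bisect import bisect_right

def natural_cubic_spline(ts, ys):
    """Return f(t) evaluating the natural cubic spline through (ts, ys)."""
    n = len(ts) - 1
    h = [ts[i + 1] - ts[i] for i in range(n)]
    m = [0.0] * (n + 1)  # second derivatives at knots; natural: m[0] = m[n] = 0
    if n >= 2:
        diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]
        rhs = [6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
               for i in range(1, n)]
        for k in range(1, n - 1):  # forward elimination (Thomas algorithm)
            w = h[k] / diag[k - 1]
            diag[k] -= w * h[k]
            rhs[k] -= w * rhs[k - 1]
        m[n - 1] = rhs[n - 2] / diag[n - 2]
        for k in range(n - 3, -1, -1):  # back substitution
            m[k + 1] = (rhs[k] - h[k + 1] * m[k + 2]) / diag[k]

    def f(t):
        i = min(max(bisect_right(ts, t) - 1, 0), n - 1)
        a, b, hi = ts[i + 1] - t, t - ts[i], h[i]
        return ((m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * hi)
                + (ys[i] / hi - m[i] * hi / 6.0) * a
                + (ys[i + 1] / hi - m[i + 1] * hi / 6.0) * b)
    return f

def trapz(vals, dt):
    """Trapezoid-rule integral of equally spaced samples."""
    return dt * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

def align(samples, dt, length):
    """Assumed alignment: shift the curve so its integral over the interval is 0."""
    mean = trapz(samples, dt) / length
    return [v - mean for v in samples]

def ise(u, v, dt):
    """Approximate integrated squared error between two sampled curves."""
    return trapz([(a - b) ** 2 for a, b in zip(u, v)], dt)

def kmeans_profiles(profiles, times, k, init, grid=200, iters=50):
    """Cluster aligned spline profiles; `init` lists the k seed profile indices."""
    t0, t1 = times[0], times[-1]
    dt = (t1 - t0) / grid
    ts_grid = [t0 + j * dt for j in range(grid + 1)]
    curves = []
    for ys in profiles:
        f = natural_cubic_spline(times, ys)
        curves.append(align([f(t) for t in ts_grid], dt, t1 - t0))
    centroids = [curves[i][:] for i in init]
    labels = None
    for _ in range(iters):
        new = [min(range(k), key=lambda c: ise(x, centroids[c], dt)) for x in curves]
        if new == labels:  # assignments stable: converged
            break
        labels = new
        for c in range(k):
            members = [x for x, l in zip(curves, labels) if l == c]
            if members:  # centroid = pointwise mean of the aligned member curves
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, curves

# Toy data (made up): two rising and two falling profiles at different offsets.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
profiles = [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [4, 3, 2, 1, 0], [-1, -2, -3, -4, -5]]
labels, curves = kmeans_profiles(profiles, times, k=2, init=[0, 2])
print(labels)  # [0, 0, 1, 1]
```

In this toy run the two rising profiles coincide after alignment even though their raw values differ by a constant offset, so the clustering depends on profile shape rather than absolute expression level. A pointwise mean of zero-integral curves also has zero integral, so the centroids stay aligned under this assumption.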


Keywords: Microarrays, Time-Series Data, Gene Expression Profiles, Profile Alignment, Cubic Spline, k-Means Clustering



Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Numanul Subhani (1)
  • Alioune Ngom (1)
  • Luis Rueda (1)
  • Conrad Burden (2)
  1. School of Computer Science, 5115 Lambton Tower, University of Windsor, Windsor, Canada
  2. Centre for Bioinformation Science, Mathematical Sciences Institute and John Curtin School of Medical Research, The Australian National University, Canberra, Australia