Frequent Episode Mining to Support Pattern Analysis in Developmental Biology

  • Ronnie Bathoorn
  • Monique Welten
  • Michael Richardson
  • Arno Siebes
  • Fons J. Verbeek
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6282)


We introduce a new method for the analysis of heterochrony in developmental biology. The method builds on techniques from data mining and intelligent data analysis that are applied in, e.g., shopping basket analysis, alarm network analysis and click stream analysis. We have adapted so-called frequent episode mining to the analysis of developmental timing in different (model) species. This is accomplished by extracting small temporal patterns, i.e., episodes, and subsequently comparing the species on the basis of the extracted patterns. The method allows the development of different species to be related using different types of data. In examples we show that the method can reconstruct a phylogenetic tree from gene-expression data as well as from strict morphological characters. The method can deal with incomplete and/or missing data. Moreover, the method is flexible and not restricted to one particular type of data: by simply transposing the dataset accordingly, it allows comparison of species, of genes, and of morphological characters based on developmental patterns. We illustrate a range of applications.
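The idea of extracting episodes and comparing species on the extracted patterns can be illustrated with a minimal sketch. It is not the authors' implementation. The definition of an episode (an ordered pair of events at most `window` time steps apart), the support threshold counted across species, the Jaccard distance, the function names and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def serial_episodes(sequence, window=2):
    """Collect ordered event pairs (a, b) where a occurs strictly before b,
    at most `window` time steps apart. `sequence` is a list of sets of events,
    one set per time step; an empty set (missing data) contributes nothing."""
    episodes = set()
    for i, earlier in enumerate(sequence):
        for later in sequence[i + 1 : i + 1 + window]:
            for a in earlier:
                for b in later:
                    episodes.add((a, b))
    return episodes


def frequent_episodes(episode_sets, min_support=1):
    """Keep only episodes found in at least `min_support` species
    (assumed notion of 'frequent' for this sketch)."""
    counts = Counter(ep for eps in episode_sets.values() for ep in eps)
    frequent = {ep for ep, n in counts.items() if n >= min_support}
    return {name: eps & frequent for name, eps in episode_sets.items()}


def jaccard_distance(x, y):
    """1 - |x & y| / |x | y|; two empty sets are treated as identical."""
    union = x | y
    if not union:
        return 0.0
    return 1.0 - len(x & y) / len(union)


def pairwise_distances(sequences, window=2, min_support=1):
    """Mine episodes per species, filter by support, compare all species pairs."""
    episode_sets = {name: serial_episodes(seq, window) for name, seq in sequences.items()}
    filtered = frequent_episodes(episode_sets, min_support)
    return {
        (p, q): jaccard_distance(filtered[p], filtered[q])
        for p, q in combinations(sorted(filtered), 2)
    }


# Hypothetical toy data: event names and orderings are invented.
toy = {
    "species_A": [{"heart"}, {"eye"}, {"limb"}],
    "species_B": [{"heart"}, {"eye"}, {"limb"}],
    "species_C": [{"eye"}, {"heart"}, {"limb"}],
}

distances = pairwise_distances(toy)
for pair, d in distances.items():
    print(pair, round(d, 3))
```

In this toy example species_A and species_B share all episodes, while species_C differs in the relative order of two events. A distance table of this kind could then be passed to a hierarchical clustering routine to obtain a tree. Swapping the roles of species and events in the input would, under the same assumptions, compare events instead of species.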


Keywords: frequent episode mining, heterochrony, pattern analysis, developmental biology



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Ronnie Bathoorn (2)
  • Monique Welten (1)
  • Michael Richardson (1)
  • Arno Siebes (2)
  • Fons J. Verbeek (1)

  1. Imaging & BioInformatics, LIACS, Leiden University, The Netherlands
  2. Distributed Databases, Computer Science, Utrecht University, The Netherlands
