The BioKET Biodiversity Data Warehouse: Data and Knowledge Integration and Extraction

  • Somsack Inthasone
  • Nicolas Pasquier
  • Andrea G. B. Tettamanzi
  • Célia da Costa Pereira
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8819)


Biodiversity datasets are generally stored in different formats. This makes it difficult for biologists to combine and integrate them in order to retrieve useful information, for example to classify specimens efficiently. In this paper, we present BioKET, a data warehouse that consolidates heterogeneous data sources stored in different formats. For the time being, the scope of BioKET is botanical. Among other things, we had to list all the existing botanical ontologies and relate terms in BioKET to terms in these ontologies. We demonstrate the usefulness of such a resource by applying FIST, a combined biclustering and conceptual association rule extraction method, to a dataset extracted from BioKET in order to analyze the risk status of plants endemic to Laos. In addition, BioKET may be interfaced with other resources, such as GeoCAT, to provide a powerful analysis tool for biodiversity data.


Biodiversity, Information Technology, Ontologies, Knowledge Integration, Data Mining





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Somsack Inthasone (1)
  • Nicolas Pasquier (1)
  • Andrea G. B. Tettamanzi (1)
  • Célia da Costa Pereira (1)
  1. Univ. Nice Sophia Antipolis, CNRS, I3S, UMR 7271, Sophia Antipolis, France
