BioHIPI: Biomedical Hadoop Image Processing Interface

  • Francesco Calimeri
  • Mirco Caracciolo
  • Aldo Marzullo (email author)
  • Claudio Stamile
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10710)


The collection of large amounts of data, together with the application of efficient and effective analysis techniques, is becoming increasingly important in many areas. One of the most important fields in which Big Data is gaining fundamental importance is the biomedical domain, partly because of the decreasing cost of acquiring and analyzing biomedical data. Furthermore, the emergence of more accessible technologies and the increasing speed of algorithms, achieved in part through parallelization techniques, are helping to make the application of Big Data in healthcare a fast-growing field.

This paper presents a novel framework, the Biomedical Hadoop Image Processing Interface (BioHIPI), capable of storing biomedical image collections in a Distributed File System (DFS) in order to exploit the parallel processing of Big Data on a cluster of machines. The work is based on Apache Hadoop and uses the Hadoop Distributed File System (HDFS) for storing images, the MapReduce libraries for parallel processing, and Yet Another Resource Negotiator (YARN) for running processes on the cluster.
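The map/reduce pattern that such a framework relies on can be illustrated with a minimal in-process sketch. This is not the BioHIPI API and does not use Hadoop at all: the record layout, the function names (`map_image`, `shuffle`, `reduce_modality`, `run_job`), and the sample data are assumptions made purely for illustration. On a real cluster the records would be read from HDFS and the steps scheduled by YARN.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical stand-in for an image collection stored in a DFS:
# each record is (image_id, modality, pixels), pixels being a flat list of grey levels.
IMAGES = [
    ("scan-001", "MRI", [0, 10, 20, 30]),
    ("scan-002", "MRI", [40, 50, 60, 70]),
    ("scan-003", "CT", [100, 100, 100, 100]),
]


def map_image(record):
    """Map step: emit one (modality, mean intensity) pair per image."""
    _image_id, modality, pixels = record
    yield modality, mean(pixels)


def shuffle(pairs):
    """Group mapped values by key, as a MapReduce runtime does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_modality(key, values):
    """Reduce step: average the per-image means for one modality."""
    return key, mean(values)


def run_job(records):
    """Run map, shuffle, and reduce sequentially in a single process."""
    mapped = (pair for record in records for pair in map_image(record))
    return dict(reduce_modality(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    print(run_job(IMAGES))
```

Because each image is mapped independently, the map step could in principle run on whichever machine holds that image, which is the property a DFS-backed design aims to exploit.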


Keywords: Big Data, Hadoop, Image processing



Claudio Stamile is funded by an EU MC ITN TRANSACT 2012 (316679) project. Francesco Calimeri has been partially supported by the Italian Ministry for Economic Development (MISE) under project “PIUCultura – Paradigmi Innovativi per l’Utilizzo della Cultura” (n. F/020016/01-02/X27), and by the EU under project “Smarter Solutions in the Big Data World (S2BDW)” (n. F/050389/01-03/X32) funded within the call “HORIZON2020” PON I&C 2014-2020.



Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  • Francesco Calimeri (1)
  • Mirco Caracciolo (1)
  • Aldo Marzullo (1), email author
  • Claudio Stamile (2)

  1. Department of Mathematics and Computer Science, University of Calabria, Rende, Italy
  2. Department of Electrical Engineering (ESAT), STADIUS, Katholieke Universiteit Leuven, Leuven, Belgium
