Jungle Computing: Distributed Supercomputing Beyond Clusters, Grids, and Clouds

  • Frank J. Seinstra
  • Jason Maassen
  • Rob V. van Nieuwpoort
  • Niels Drost
  • Timo van Kessel
  • Ben van Werkhoven
  • Jacopo Urbani
  • Ceriel Jacobs
  • Thilo Kielmann
  • Henri E. Bal
Part of the Computer Communications and Networks book series (CCN)


In recent years, the application of high-performance and distributed computing in scientific practice has become increasingly widespread. Among the platforms most widely available to scientists are clusters, grids, and cloud systems. Such infrastructures are currently undergoing revolutionary change due to the integration of many-core technologies, which provide orders-of-magnitude speed improvements for selected compute kernels. With high-performance and distributed computing systems thus becoming more heterogeneous and hierarchical, programming complexity is vastly increased. Further complexities arise because the urgent desire for scalability, together with issues including data distribution, software heterogeneity, and ad hoc hardware availability, commonly forces scientists into simultaneous use of multiple platforms (e.g., clusters, grids, and clouds used concurrently). A true computing jungle.

In this chapter we explore the possibilities of enabling efficient and transparent use of Jungle Computing Systems in everyday scientific practice. To this end, we discuss the fundamental methodologies required for defining programming models that are tailored to the specific needs of scientific researchers. Importantly, we claim that many of these fundamental methodologies already exist today, as integrated in our Ibis high-performance distributed programming system. We also make a case for the urgent need for easy and efficient Jungle Computing in scientific practice, by exploring a set of state-of-the-art application domains. For one of these domains, we present results obtained with Ibis on a real-world Jungle Computing System. The chapter concludes by exploring fundamental research questions to be investigated in the years to come.


Keywords (machine-generated, not supplied by the authors): Overlay Network; Java Virtual Machine; Desktop Grid; Connectivity Problem; Worldwide Scale



Copyright information

© Springer-Verlag London Limited 2011

Authors and Affiliations

  • Frank J. Seinstra (1)
  • Jason Maassen (1)
  • Rob V. van Nieuwpoort (1)
  • Niels Drost (1)
  • Timo van Kessel (1)
  • Ben van Werkhoven (1)
  • Jacopo Urbani (1)
  • Ceriel Jacobs (1)
  • Thilo Kielmann (1)
  • Henri E. Bal (1)
  1. Department of Computer Science, Vrije Universiteit, Amsterdam, The Netherlands
