Application-Level Interoperability Across Grids and Clouds

  • Shantenu Jha
  • Andre Luckow
  • Andre Merzky
  • Miklos Erdely
  • Saurabh Sehgal
Part of the Computer Communications and Networks book series (CCN)

Abstract

Application-level interoperability is the ability of an application to utilize multiple heterogeneous distributed resources. Such interoperability is becoming increasingly important as data volumes grow and as both the sources of data and the types of resources multiply. The primary aim of this chapter is to understand the different ways in which application-level interoperability can be provided across distributed infrastructure. We do so by (i) using the canonical wordcount application, built on an enhanced version of MapReduce that scales out across clusters, clouds, and HPC resources; (ii) establishing how SAGA enables the wordcount application to execute using MapReduce and other programming models, such as Sphere, concurrently; and (iii) demonstrating the scale-out of ensemble-based biomolecular simulations across multiple resources. We show user-level control of the relative placement of compute and data, and provide simple performance measurements and analysis of SAGA-MapReduce when using multiple, heterogeneous infrastructures concurrently for the same problem instance. Finally, we discuss Azure and some of the system-level abstractions it provides, and show how these are used to support ensemble-based biomolecular simulations.
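To make the execution model concrete, the sketch below shows how an application can submit the same wordcount task to heterogeneous backends through the SAGA job API (GFD.90) simply by changing the resource URL. This is a minimal illustration using the SAGA Python bindings, not the chapter's SAGA-MapReduce implementation; the resource URLs, the wordcount executable, and the input path are hypothetical placeholders.

```python
# Minimal sketch (not the chapter's code): dispatching the same wordcount
# task to heterogeneous resources via the SAGA job API (GFD.90).
# Assumes the SAGA Python bindings are installed; the 'wordcount' binary
# and all URLs/paths below are hypothetical.
import saga

def run_wordcount(resource_url, input_file):
    """Run wordcount on the resource behind resource_url and return the job state."""
    jd = saga.job.Description()
    jd.executable = "/usr/local/bin/wordcount"  # hypothetical worker binary
    jd.arguments = [input_file]

    # The job description is backend-independent; the URL scheme selects
    # the middleware adaptor (local fork, SSH, PBS cluster, cloud VM, ...).
    js = saga.job.Service(resource_url)
    job = js.create_job(jd)
    job.run()
    job.wait()
    return job.state

# The same problem instance spread over different infrastructures
# (run sequentially here for brevity; the chapter's experiments run them
# concurrently under SAGA-MapReduce's control).
for url in ("fork://localhost",                  # local execution
            "pbs+ssh://queenbee.loni.org",       # HPC/grid resource (hypothetical)
            "ssh://my-cloud-vm.example.org"):    # cloud-provisioned VM (hypothetical)
    print(url, "->", run_wordcount(url, "/data/wordcount/chunk-000"))
```

Because only the URL changes between backends, decisions about where compute runs relative to the data reduce to choosing which URLs receive which chunks, which is the user-level placement control the chapter measures.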

Keywords

Virtual Machine · File System · Physical Node · Chunk Size · Distributed File System
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Acknowledgements

SJ acknowledges UK EPSRC grant number GR/D0766171/1 for supporting SAGA and the e-Science Institute, Edinburgh for the research theme “Distributed Programming Abstractions.” SJ also acknowledges financial support from NSF-Cybertools and NIH-INBRE Grants, while ME acknowledges support from the grant OTKA NK 72845. We also acknowledge internal resources of the Center for Computation & Technology (CCT) at LSU and computer resources provided by LONI/TeraGrid for QueenBee. We thank Chris Miceli, Michael Miceli, Katerina Stamou, Hartmut Kaiser, and Lukasz Lacinski for their collaborative efforts on early parts of this work. We thank Mario Antonioletti and Neil Chue Hong for supporting this work through GSoC-2009 (OMII-UK Mentor Organization).


Copyright information

© Springer-Verlag London Limited 2011

Authors and Affiliations

  • Shantenu Jha (1)
  • Andre Luckow (1)
  • Andre Merzky (1)
  • Miklos Erdely (2)
  • Saurabh Sehgal (1)
  1. Louisiana State University, Baton Rouge, USA
  2. University of Pannonia, Veszprem, Hungary
