Abstract
Application-level interoperability is defined as the ability of an application to utilize multiple distributed heterogeneous resources. Such interoperability is becoming increasingly important as data volumes grow and as the number of data sources and resource types increases. The primary aim of this chapter is to understand the different ways in which application-level interoperability can be provided across distributed infrastructure. We achieve this by (i) using the canonical wordcount application, based on an enhanced version of MapReduce that scales out across clusters, clouds, and HPC resources, (ii) establishing how SAGA enables the wordcount application to be executed concurrently using MapReduce and other programming models such as Sphere, and (iii) demonstrating the scale-out of ensemble-based biomolecular simulations across multiple resources. We show user-level control of the relative placement of compute and data, and we provide simple performance measures and analysis of SAGA–MapReduce when multiple heterogeneous infrastructures are used concurrently for the same problem instance. Finally, we discuss Azure and some of the system-level abstractions that it provides, and we show how it is used to support ensemble-based biomolecular simulations.
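For readers unfamiliar with the wordcount application mentioned above, the following is a minimal single-process sketch of its map and reduce phases. It is illustrative only: it does not use SAGA, SAGA–MapReduce, or Sphere, and the function names (`map_words`, `reduce_counts`, `wordcount`) and the sample input are assumptions made for this example. In a distributed setting each input chunk would be mapped on a different resource; here the chunks are processed in a simple loop.

```python
from collections import Counter
from itertools import chain


def map_words(chunk):
    """Map phase: emit a (word, 1) pair for every whitespace-separated token."""
    return [(word.lower(), 1) for word in chunk.split()]


def reduce_counts(pairs):
    """Reduce phase: sum the counts emitted for each distinct word."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def wordcount(chunks):
    """Run the map phase on every chunk, then reduce all emitted pairs."""
    # Hypothetical stand-in for distributed workers: each chunk is mapped
    # independently, so the map calls could run on separate resources.
    mapped = [map_words(chunk) for chunk in chunks]
    return reduce_counts(chain.from_iterable(mapped))


# Hypothetical input: two chunks standing in for data held at two sites.
chunks = ["grid cloud grid", "cloud HPC"]
counts = wordcount(chunks)
```

Because the map calls share no state, where each chunk is processed can be chosen independently of where the reduce step runs, which is the kind of compute and data placement decision the chapter examines.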
Acknowledgements
SJ acknowledges UK EPSRC grant number GR/D0766171/1 for supporting SAGA and the e-Science Institute, Edinburgh for the research theme “Distributed Programming Abstractions.” SJ also acknowledges financial support from NSF-Cybertools and NIH-INBRE Grants, while ME acknowledges support from the grant OTKA NK 72845. We also acknowledge internal resources of the Center for Computation & Technology (CCT) at LSU and computer resources provided by LONI/TeraGrid for QueenBee. We thank Chris Miceli, Michael Miceli, Katerina Stamou, Hartmut Kaiser, and Lukasz Lacinski for their collaborative efforts on early parts of this work. We thank Mario Antonioletti and Neil Chue Hong for supporting this work through GSoC-2009 (OMII-UK Mentor Organization).
Copyright information
© 2011 Springer-Verlag London Limited
About this chapter
Cite this chapter
Jha, S., Luckow, A., Merzky, A., Erdely, M., Sehgal, S. (2011). Application-Level Interoperability Across Grids and Clouds. In: Cafaro, M., Aloisio, G. (eds) Grids, Clouds and Virtualization. Computer Communications and Networks. Springer, London. https://doi.org/10.1007/978-0-85729-049-6_9
DOI: https://doi.org/10.1007/978-0-85729-049-6_9
Publisher Name: Springer, London
Print ISBN: 978-0-85729-048-9
Online ISBN: 978-0-85729-049-6
eBook Packages: Computer Science (R0)