A Large-Scale Elastic Environment for Scientific Computing

  • Conference paper
Software and Data Technologies (ICSOFT 2012)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 411)

Abstract

The relatively recent introduction of infrastructure-as-a-service (IaaS) clouds, such as Amazon Elastic Compute Cloud (EC2), provides users with the ability to deploy custom software stacks in virtual machines (VMs) across different cloud providers. Users can leverage IaaS clouds to create elastic environments that outsource compute and storage as needed. Additionally, these environments can adapt dynamically to demand, scaling up as demand increases and scaling down as demand decreases. In this paper, we present a large-scale elastic environment that extends cluster resource managers (e.g., Torque) with IaaS resources. Our solution integrates with an open-source elastic manager, the Elastic Processing Unit (EPU), and includes the ability to periodically recontextualize the environment with a lightweight REST-based recontextualization broker. We deploy the Gluster file system to provide a shared file system for all nodes in the environment. Though our implementation currently supports only Torque, we also thoroughly discuss how our architecture can interface with different workflows, including Hadoop’s MapReduce workflows and Condor’s match-making and high-throughput capabilities. For evaluation, we demonstrate the ability to recontextualize 256-node environments within one second of the recontextualization period, scale to over 475 nodes in less than 15 minutes, and support parallel I/O from distributed nodes.
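
As a rough illustration of the elastic behavior described in the abstract (scaling up as demand increases and down as it decreases), the following minimal Python sketch computes a target worker count from job demand. It is not the EPU's policy or API: the function names `target_nodes` and `scaling_delta`, their parameters, the default bounds, and the fixed jobs-per-node ratio are all hypothetical and chosen only for illustration.

```python
def target_nodes(queued_jobs: int, running_jobs: int,
                 min_nodes: int = 0, max_nodes: int = 512,
                 jobs_per_node: int = 1) -> int:
    """Return how many worker nodes a simple on-demand policy would request.

    Hypothetical illustration only; not the EPU's actual policy.
    The default bounds (0 and 512) are arbitrary placeholders.
    """
    if jobs_per_node < 1:
        raise ValueError("jobs_per_node must be at least 1")
    # Total demand is every job that is waiting or already running.
    demand = max(queued_jobs, 0) + max(running_jobs, 0)
    # Ceiling division: nodes needed to host all of those jobs.
    needed = -(-demand // jobs_per_node)
    # Clamp the result to the configured lower and upper bounds.
    return max(min_nodes, min(max_nodes, needed))


def scaling_delta(current_nodes: int, target: int) -> int:
    """Positive: launch that many VMs; negative: terminate that many; zero: no change."""
    return target - current_nodes
```

Under these assumptions, a controller could call `target_nodes` once per polling interval with counts read from the resource manager's queue, then launch or terminate VMs according to `scaling_delta`.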

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Marshall, P., Tufo, H., Keahey, K., LaBissoniere, D., Woitaszek, M. (2013). A Large-Scale Elastic Environment for Scientific Computing. In: Cordeiro, J., Hammoudi, S., van Sinderen, M. (eds) Software and Data Technologies. ICSOFT 2012. Communications in Computer and Information Science, vol 411. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45404-2_8

  • DOI: https://doi.org/10.1007/978-3-642-45404-2_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-45403-5

  • Online ISBN: 978-3-642-45404-2

  • eBook Packages: Computer Science, Computer Science (R0)
