The FutureGrid Testbed for Big Data

Abstract

In this chapter we introduce FutureGrid, a testbed for research in Cloud, Grid, and High Performance Computing. Although FutureGrid has only a modest number of compute cores (about 4,500 regular cores and 14,000 GPU cores), it provides an ideal playground for testing frameworks that users may want to consider as part of their big data analysis pipelines. We focus here on the use of FutureGrid for big data related testbed research. The chapter is structured as follows. First, we introduce the FutureGrid hardware (Sect. 2). Next, we focus on a selected number of services and tools that have proven useful for conducting big data research on FutureGrid (Sect. 3). We contrast frameworks such as MPI, virtual large memory systems, Infrastructure as a Service, and map/reduce frameworks. We then analyze requests to use certain technologies and identify trends within the user community that direct effort in FutureGrid (Sect. 4). The following section reports on our experience with integrating our software and systems teams via DevOps (Sect. 5). Next, we summarize Cloudmesh, which is a logical continuation of the FutureGrid architecture. It provides the ability to federate cloud services and to conduct cloudshifting, that is, to assign servers on demand to HPC and Cloud services (Sect. 6). We conclude the chapter with a brief summary (Sect. 7).

References

  1. “Apache Hadoop Project.” [Online]. Available: http://hadoop.apache.org

  2. “Gartner’s 2013 hype cycle for emerging technologies maps out evolving relationship between humans and machines,” Press Release. [Online]. Available: http://www.gartner.com/newsroom/id/2575515

  3. “Jira ticket system,” Web Page. [Online]. Available: https://confluence.atlassian.com/display/JIRAKB/Using+JIRA+for+Helpdesk+or+Support

  4. “Map reduce,” Wikipedia. [Online]. Available: http://en.wikipedia.org/wiki/MapReduce

  5. “Rt: Request tracker,” Web Page. [Online]. Available: http://www.bestpractical.com/rt/

  6. “Twister: Iterative mapreduce,” Web Page. [Online]. Available: http://www.iterativemapreduce.org

  7. W. Barth, Nagios: System and Network Monitoring, U.S. ed. No Starch Press, 2006. [Online]. Available: http://www.amazon.de/gp/redirect.html%3FASIN=1593270704%26tag=ws%26lcode=xm2%26cID=2025%26ccmID=165953%26location=/o/ASIN/1593270704%253FSubscriptionId=13CT5CVB80YFWJEPWS02

  8. J. Dean and S. Ghemawat, “Mapreduce: Simplified data processing on large clusters,” Commun. ACM, vol. 51, no. 1, pp. 107–113, Jan. 2008. [Online]. Available: http://doi.acm.org/10.1145/1327452.1327492

  9. J. Diaz, G. von Laszewski, F. Wang, and G. C. Fox, “Abstract Image Management and Universal Image Registration for Cloud and HPC Infrastructures,” in IEEE Cloud 2012, Honolulu, Jun. 2012. [Online]. Available: http://cyberaide.googlecode.com/svn/trunk/papers/12-cloud12-imagemanagement/vonLaszewski-12-IEEECloud2012.pdf

  10. J. Diaz, G. von Laszewski, F. Wang, A. J. Younge, and G. C. Fox, “FutureGrid Image Repository: A Generic Catalog and Storage System for Heterogeneous Virtual Machine Images,” in Third IEEE International Conference on Cloud Computing Technology and Science (CloudCom2011), Athens, Greece, Dec. 2011, pp. 560–564. [Online]. Available: http://cyberaide.googlecode.com/svn/trunk/papers/11-cloudcom11-imagerepo/vonLaszewski-draft-11-imagerepo.pdf

  11. G. C. Fox, G. von Laszewski, J. Diaz, K. Keahey, J. Fortes, R. Figueiredo, S. Smallen, W. Smith, and A. Grimshaw, Contemporary HPC Architectures, draft ed., 2012, ch. FutureGrid - a reconfigurable testbed for Cloud, HPC and Grid Computing. [Online]. Available: http://cyberaide.googlecode.com/svn/trunk/papers/pdf/vonLaszewski-12-fg-bookchapter.pdf

  12. Gregor, “Cloudmesh on Github,” Web Page. [Online]. Available: http://cloudmesh.github.io/cloudmesh/

  13. S. Krishnan and G. von Laszewski, “Using hadoop on futuregrid,” Web Page, Manual, 2013. [Online]. Available: http://futuregrid.github.io/manual/hadoop.html

  14. “The Network Impairments device is Spirent XGEM,” 2012. [Online]. Available: http://www.spirent.com/Solutions-Directory/ImpairmentsGEM.aspx?oldtab=0&oldpg0=2

  15. S. Krishnan, M. Tatineni, and C. Baru, “myHadoop - Hadoop-on-Demand on Traditional HPC Resources,” Tech. Rep., 2011. [Online]. Available: http://www.sdsc.edu/~allans/MyHadoop.pdf

  16. G. K. Lockwood, “myhadoop 2.” [Online]. Available: https://github.com/glennklockwood/myhadoop

  17. M. L. Massie, B. N. Chun, and D. E. Culler, “The Ganglia Distributed Monitoring System: Design, Implementation, and Experience,” in Journal of Parallel Computing, April 2004.

  18. J. Qiu, “Course: Fall 2013 P434 Distributed Systems Undergraduate Course.” [Online]. Available: https://portal.futuregrid.org/projects/368

  19. J. Qiu, “Spring 2014 CSCI-B649 Cloud Computing MOOC for residential and online students.” [Online]. Available: https://portal.futuregrid.org/projects/405

  20. L. Ramakrishnan, “FRIEDA: Flexible Robust Intelligent Elastic Data Management.” [Online]. Available: https://portal.futuregrid.org/projects/298

  21. “ScaleMP,” 2012. [Online]. Available: http://www.scalemp.com/

  22. S. Smallen, K. Ericson, J. Hayes, and C. Olschanowsky, “User-level grid monitoring with inca 2,” in Proceedings of the 2007 workshop on Grid monitoring, ser. GMW ’07. New York, NY, USA: ACM, 2007, pp. 29–38. [Online]. Available: http://doi.acm.org/10.1145/1272680.1272687

  23. G. von Laszewski, “Cmd3,” Github Documentation and Code. [Online]. Available: http://cloudmesh.futuregrid.org/cmd3/

  24. G. von Laszewski and M. Hategan, “Workflow Concepts of the Java CoG Kit,” Journal of Grid Computing, vol. 3, pp. 239–258, Jan. 2005. [Online]. Available: http://cyberaide.googlecode.com/svn/trunk/papers/anl/vonLaszewski-workflow-taylor-anl.pdf

  25. G. von Laszewski, J. Diaz, F. Wang, and G. C. Fox, “Comparison of Multiple Cloud Frameworks,” in IEEE Cloud 2012, Honolulu, HI, Jun. 2012. [Online]. Available: http://cyberaide.googlecode.com/svn/trunk/papers/12-cloud12-cloudcompare/laszewski-IEEECloud2012_id-4803.pdf

  26. G. von Laszewski, I. Foster, J. Gawor, and P. Lane, “A Java Commodity Grid Kit,” Concurrency and Computation: Practice and Experience, vol. 13, no. 8–9, pp. 645–662, 2001. [Online]. Available: http://cyberaide.googlecode.com/svn/trunk/papers/anl/vonLaszewski-cog-cpe-final.pdf

  27. G. von Laszewski, G. C. Fox, F. Wang, A. J. Younge, Kulshrestha, G. G. Pike, W. Smith, J. Voeckler, R. J. Figueiredo, J. Fortes, K. Keahey, and E. Deelman, “Design of the FutureGrid Experiment Management Framework,” in Proceedings of Gateway Computing Environments 2010 (GCE2010) at SC10. New Orleans, LA: IEEE, Nov. 2010. [Online]. Available: http://cyberaide.googlecode.com/svn/trunk/papers/10-FG-exp-GCE10/vonLaszewski-10-FG-exp-GCE10.pdf

  28. G. von Laszewski, M. Hategan, and D. Kodeboyina, Workflows for E-science: Scientific Workflows for Grids. Secaucus, NJ, USA: Springer-Verlag New York, Inc., 2007, ch. Grid Workflow with the Java CoG Kit. [Online]. Available: http://cyberaide.googlecode.com/svn/trunk/papers/anl/vonLaszewski-workflow-book.pdf

  29. G. von Laszewski, H. Lee, J. Diaz, F. Wang, K. Tanaka, S. Karavinkoppa, G. C. Fox, and T. Furlani, “Design of an Accounting and Metric-based Cloud-shifting and Cloud-seeding Framework for Federated Clouds and Bare-metal Environments,” in Proceedings of the 2012 Workshop on Cloud Services, Federation, and the 8th Open Cirrus Summit, ser. FederatedClouds ’12. New York, NY, USA: ACM, 2012, pp. 25–32.

  30. G. von Laszewski and X. Yang, “Virtual cluster with slurm,” Github repository. [Online]. Available: https://github.com/cloudmesh/cluster

  31. A. J. Younge, R. Henschel, J. T. Brown, G. von Laszewski, J. Qiu, and G. C. Fox, “Analysis of Virtualization Technologies for High Performance Computing Environments,” in Proceedings of the 4th International Conference on Cloud Computing (CLOUD 2011). Washington, DC: IEEE, July 2011, pp. 9–16. [Online]. Available: http://cyberaide.googlecode.com/svn/trunk/papers/10-fg-hypervisor/10-fg-hypervisor.pdf

Acknowledgements

Some of the text published in this chapter is available from the FutureGrid portal. The FutureGrid project is funded by the National Science Foundation (NSF) and is led by Indiana University with University of Chicago, University of Florida, San Diego Supercomputer Center, Texas Advanced Computing Center, University of Virginia, University of Tennessee, University of Southern California, Dresden, Purdue University, and Grid 5000 as partner sites. This material is based upon work supported in part by the National Science Foundation under Grant No. 0910812 [11]. If you use FutureGrid and produce a paper or presentation, we ask you to include the references [11, 27] as well as this chapter. We would like to thank Fugang Wang for developing the framework that allowed us to produce the statistical data, and Hyungro Lee for assisting in the creation of the data tables that led to Figs. 11 and 12. Furthermore, we would like to thank Barbara O’Leary for proofreading this paper.

Corresponding author

Correspondence to Gregor von Laszewski.

Copyright information

© 2014 Springer Science+Business Media New York

About this chapter

Cite this chapter

von Laszewski, G., Fox, G.C. (2014). The FutureGrid Testbed for Big Data. In: Li, X., Qiu, J. (eds) Cloud Computing for Data-Intensive Applications. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-1905-5_2

  • DOI: https://doi.org/10.1007/978-1-4939-1905-5_2

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4939-1904-8

  • Online ISBN: 978-1-4939-1905-5

  • eBook Packages: Computer Science, Computer Science (R0)
