Towards Benchmarking IaaS and PaaS Clouds for Graph Analytics

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8991)

Abstract

Cloud computing is a new paradigm for using ICT services—only when needed and for as long as needed, and paying only for the service actually consumed. Benchmarking the rapidly growing number of cloud services is crucial for market growth and perceived fairness, and for service design and tuning. In this work, we propose a generic architecture for benchmarking cloud services. Motivated by the recent demand for data-intensive ICT services, and in particular by the processing of large graphs, we adapt the generic architecture to Graphalytics, a benchmark for distributed and GPU-based graph-analytics platforms. Graphalytics focuses on the dependence of performance on the input dataset, on the analytics algorithm, and on the provisioned infrastructure. The benchmark provides components for platform configuration, deployment, and monitoring, and has been tested on a variety of platforms. We also propose a new challenge for the process of benchmarking data-intensive services, namely the inclusion of the data-processing algorithm in the system under test; this significantly increases the relevance of benchmarking results, albeit at the cost of increased benchmarking duration.
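The benchmarking structure the abstract describes can be pictured as a cross product of the three factors on which performance depends. The following is a hypothetical sketch, not the actual Graphalytics code; all names (datasets, algorithms, platforms, and the `run_benchmark` helper) are illustrative assumptions.

```python
# Hypothetical sketch of the benchmark matrix described in the abstract:
# performance is measured for every combination of input dataset,
# analytics algorithm, and provisioned platform.
from itertools import product

DATASETS = ["graph500-scale22", "snb-sf1000"]   # example inputs
ALGORITHMS = ["bfs", "pagerank", "wcc"]         # example analytics kernels
PLATFORMS = ["giraph-cluster", "gpu-node"]      # example infrastructures


def run_benchmark(dataset, algorithm, platform):
    """Placeholder for deploying the platform, loading the dataset,
    running the algorithm, and collecting monitoring data."""
    return {"dataset": dataset, "algorithm": algorithm,
            "platform": platform, "runtime_s": None}


def benchmark_matrix():
    # One result record per (dataset, algorithm, platform) combination.
    return [run_benchmark(d, a, p)
            for d, a, p in product(DATASETS, ALGORITHMS, PLATFORMS)]


results = benchmark_matrix()
print(len(results))  # 2 datasets x 3 algorithms x 2 platforms = 12 runs
```

Including the data-processing algorithm in the system under test, as the paper proposes, corresponds to letting each platform choose its own implementation of `run_benchmark` rather than fixing a reference implementation.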


Notes

  1. In inverse chronological order: Lecture at the Fifth Workshop on Big Data Benchmarking (WBDB), Potsdam, Germany, August 2014. Lecture at the Linked Data Benchmark Council’s Fourth TUC Meeting 2014, Amsterdam, May 2014. Lecture at Intel, Haifa, Israel, June 2013. Lecture at IBM Research Labs, Haifa, Israel, May 2013. Lecture at IBM T.J. Watson, Yorktown Heights, NY, USA, May 2013. Lecture at Technion, Haifa, Israel, May 2013. Online lecture for the SPEC Research Group, 2012.

  2. http://www.graph500.org.

  3. http://iss.ices.utexas.edu/?p=projects/galois/lonestargpu.


Acknowledgments

This work is supported by the Dutch STW/NWO Veni personal grants @large (#11881) and Graphitti (#12480), by the EU FP7 project PEDCA, by the Dutch national program COMMIT and its funded project COMMissioner, and by the Dutch KIEM project KIESA. The authors wish to thank Hassan Chafi and the Oracle Research Labs, Peter Boncz and the LDBC project, and Josep Larriba-Pey and Arnau Prat Perez, whose support has made the Graphalytics benchmark possible, and Tilmann Rabl, for facilitating this material.

Author information

Corresponding author

Correspondence to Alexandru Iosup.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Iosup, A. et al. (2015). Towards Benchmarking IaaS and PaaS Clouds for Graph Analytics. In: Rabl, T., Sachs, K., Poess, M., Baru, C., Jacobsen, H.-A. (eds) Big Data Benchmarking. WBDB 2014. Lecture Notes in Computer Science, vol 8991. Springer, Cham. https://doi.org/10.1007/978-3-319-20233-4_11

  • DOI: https://doi.org/10.1007/978-3-319-20233-4_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-20232-7

  • Online ISBN: 978-3-319-20233-4

  • eBook Packages: Computer Science, Computer Science (R0)
