Bringing High Performance Computing to Big Data Algorithms

Abstract

Many ideas from High Performance Computing are applicable to Big Data problems, all the more so now that hybrid GPU computing is gaining traction in mainstream computing applications. This work discusses the differences between the High Performance Computing software stack and the Big Data software stack, then focuses on two popular computing workloads, the Alternating Least Squares algorithm and the Singular Value Decomposition, and shows how their performance can be maximized using hybrid computing techniques.

Corresponding author

Correspondence to J. Kurzak.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Anzt, H. et al. (2017). Bringing High Performance Computing to Big Data Algorithms. In: Zomaya, A., Sakr, S. (eds) Handbook of Big Data Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-49340-4_23

  • DOI: https://doi.org/10.1007/978-3-319-49340-4_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49339-8

  • Online ISBN: 978-3-319-49340-4

  • eBook Packages: Computer Science (R0)
