Implementations of Main Algorithms for Generalized Symmetric Eigenproblem on GPU Accelerator

Zhao, Yonghua; Liu, Fang; Wang, Yangang; Chi, Xuebin

doi:10.1007/978-3-642-16405-7_33


Chapter
First Online: 01 January 2013

Part of the book series: Lecture Notes in Earth System Sciences (LNESS)

Abstract

Solving a generalized symmetric eigenproblem involves two main steps: the generalized eigenproblem is first transformed to a standard eigenproblem, and the resulting matrix is then reduced to tridiagonal form. These steps are based on blocked Cholesky decomposition and blocked Householder tridiagonalization, respectively. We present parallel implementations, on a GPU accelerator using CUBLAS, of (1) the transformation to standard form, which merges the Cholesky decomposition into the transformation from generalized to standard form, and (2) the reduction of a dense matrix to tridiagonal form. Experimental results clearly demonstrate the potential of data-parallel coprocessors for scientific computations. Compared with the CPU implementation, the GPU implementations achieve speedups of more than 16-fold and 20-fold, respectively, in double precision.
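The two steps named in the abstract can be illustrated with a small sketch. For A x = λ B x with A symmetric and B symmetric positive definite, the Cholesky factor B = L Lᵀ gives the standard problem C y = λ y, where C = L⁻¹ A L⁻ᵀ and y = Lᵀ x; Householder reflections then reduce C to a tridiagonal matrix T = Qᵀ C Q. The code below is a minimal, unblocked, CPU-only version in pure Python, not the blocked CUBLAS implementation of the chapter; all function names (`cholesky`, `forward_solve`, `to_standard_form`, `tridiagonalize`) are hypothetical and chosen only for this illustration.

```python
import math


def cholesky(B):
    """Return lower-triangular L with B = L L^T (B symmetric positive definite)."""
    n = len(B)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = B[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("B is not positive definite")
        L[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = B[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = s / L[j][j]
    return L


def forward_solve(L, M):
    """Solve L X = M by forward substitution, one column of M at a time."""
    n, m = len(L), len(M[0])
    X = [[0.0] * m for _ in range(n)]
    for c in range(m):
        for i in range(n):
            s = M[i][c] - sum(L[i][k] * X[k][c] for k in range(i))
            X[i][c] = s / L[i][i]
    return X


def transpose(M):
    return [list(row) for row in zip(*M)]


def to_standard_form(A, B):
    """Return (C, L) with C = L^{-1} A L^{-T}; A must be symmetric."""
    L = cholesky(B)
    W = forward_solve(L, A)              # W = L^{-1} A
    C = forward_solve(L, transpose(W))   # L^{-1} (A L^{-T}) since W^T = A L^{-T}
    return C, L


def tridiagonalize(C):
    """Unblocked Householder reduction: return T = Q^T C Q, T tridiagonal."""
    n = len(C)
    T = [row[:] for row in C]
    for k in range(n - 2):
        x = [T[i][k] for i in range(k + 1, n)]
        norm_x = math.sqrt(sum(t * t for t in x))
        if norm_x == 0.0:
            continue                     # column is already in reduced form
        v = x[:]
        v[0] += math.copysign(norm_x, x[0])   # sign choice avoids cancellation
        vtv = sum(t * t for t in v)
        # Apply H = I - 2 v v^T / (v^T v) from the left (rows k+1..n-1) ...
        for c in range(n):
            f = 2.0 * sum(v[i] * T[k + 1 + i][c] for i in range(len(v))) / vtv
            for i in range(len(v)):
                T[k + 1 + i][c] -= f * v[i]
        # ... and from the right (columns k+1..n-1).
        for r in range(n):
            f = 2.0 * sum(T[r][k + 1 + i] * v[i] for i in range(len(v))) / vtv
            for i in range(len(v)):
                T[r][k + 1 + i] -= f * v[i]
    return T


if __name__ == "__main__":
    A = [[2.0, 1.0], [1.0, 3.0]]
    B = [[4.0, 2.0], [2.0, 5.0]]
    C, L = to_standard_form(A, B)
    print(C)  # diagonal here, so the eigenvalues 0.5 and 0.625 can be read off
```

In a blocked GPU version, the triangular solves and the two-sided Householder updates would typically be expressed as Level-3 BLAS operations on matrix panels; the sketch above only shows the underlying arithmetic.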



Acknowledgments

This work is supported by the National Science Foundation of China (Grant No. 60873113) and 863 Program (2009AA01A134, 2010AA012301).

Author information

Authors and Affiliations

Supercomputing Center, Computer Network Information Center, Chinese Academy of Sciences, 100190, Beijing, China
Yonghua Zhao, Fang Liu, Yangang Wang & Xuebin Chi

Corresponding author

Correspondence to Yonghua Zhao.

Editor information

Editors and Affiliations

University of Minnesota, Dep. of Earth Sciences and Minnesota, Supercomputing Institute, Pillsbury Hall 23, Minneapolis, 55455, Minnesota, USA
David A. Yuen
Network Information Center, Computer Center and Computer, Zhong Guan Cun 4, Beijing, 100190, China, People's Republic
Long Wang
Supercomputing Center, Zhong Guan Cun 4, Beijing, 100190, China, People's Republic
Xuebin Chi
Computer Science, University of Houston, Calhoun Street 4800, Houston, 77204, Texas, USA
Lennart Johnsson
Inst. Process Engineering (IPE), Chinese Academy of Sciences, Zhongguancun North Second Street 1, Beijing, 100190, China, People's Republic
Wei Ge
Laboratory of Computational Geodynamics, Chinese Academy of Sciences, Yu Quan Lu 19a, Beijing, 100049, China, People's Republic
Yaolin Shi

About this chapter

Cite this chapter

Zhao, Y., Liu, F., Wang, Y., Chi, X. (2013). Implementations of Main Algorithms for Generalized Symmetric Eigenproblem on GPU Accelerator. In: Yuen, D., Wang, L., Chi, X., Johnsson, L., Ge, W., Shi, Y. (eds) GPU Solutions to Multi-scale Problems in Science and Engineering. Lecture Notes in Earth System Sciences. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16405-7_33

DOI: https://doi.org/10.1007/978-3-642-16405-7_33
Published: 09 January 2013
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16404-0
Online ISBN: 978-3-642-16405-7
eBook Packages: Earth and Environmental Science (R0)
