The Journal of Supercomputing, Volume 75, Issue 3, pp 1470–1482

A pipeline structure for the block QR update in digital signal processing

  • Manuel F. Dolz
  • Fran J. Alventosa
  • Pedro Alonso-Jordá
  • Antonio M. Vidal


Some problems in digital signal processing, such as the filtering of acoustic signals, require processing a large amount of data in real time. The beamforming algorithm, for instance, can be modeled by a rectangular matrix that is built from the input signals of an acoustic system and therefore changes in real time. Obtaining the output signals requires computing the QR factorization of this matrix. In this paper, we propose organizing the concurrent computational resources of a given multicore computer in a pipeline structure to perform this factorization as fast as possible. The pipeline has been implemented using both the OpenMP application programming interface and GrPPI, a library interface for designing parallel applications based on parallel patterns. We thus address not only the performance challenge but also the programmability of our approach using parallel programming frameworks.
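To make the notion of a QR update concrete, the following is a minimal sketch in plain Python. It is not the block algorithm or the pipeline proposed in the paper; it is the textbook row-append update based on Givens rotations, and the function names (`qr_append_row`, `qr_append_block`, `gram`) are hypothetical names chosen for this illustration. The idea is that when new rows of signal data arrive, the existing triangular factor R is updated instead of refactorizing the whole matrix.

```python
import math


def qr_append_row(R, row):
    """Fold one new matrix row into the upper-triangular factor R (in place).

    R is an n x n list of lists. After the call, R^T R equals the old
    R^T R plus row^T row, i.e. R is the triangular factor of the matrix
    with the extra row appended. Hypothetical helper, for illustration only.
    """
    n = len(R)
    v = list(row)  # working copy of the incoming row
    for j in range(n):
        if v[j] == 0.0:
            continue  # entry already zero, no rotation needed
        r = math.hypot(R[j][j], v[j])
        c, s = R[j][j] / r, v[j] / r
        # Apply the Givens rotation to row j of R and to the working row,
        # which zeroes v[j] and leaves R upper triangular.
        for k in range(j, n):
            rjk, vk = R[j][k], v[k]
            R[j][k] = c * rjk + s * vk
            v[k] = -s * rjk + c * vk
    return R


def qr_append_block(R, block):
    """Fold a block of new rows into R, one row at a time (unblocked sketch)."""
    for row in block:
        qr_append_row(R, row)
    return R


def gram(rows):
    """Return rows^T rows, used here only to check the update."""
    n = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]


# Usage: start from an empty factor and feed the data in two blocks.
A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
B = [[7.0, 8.0], [9.0, 10.0]]
R = [[0.0, 0.0], [0.0, 0.0]]
qr_append_block(R, A)  # R now satisfies R^T R = A^T A
qr_append_block(R, B)  # R now also accounts for the rows of B
```

Because every step is an orthogonal rotation, the updated factor satisfies R^T R = A^T A for the stacked data, which is the property the check with `gram` relies on. A high-performance implementation would instead operate on blocks with tuned kernels, which is outside the scope of this sketch.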


QR factorization; QR update; Pipeline QR update; Jagged matrix; GrPPI; Beamforming algorithm



This work was supported by the Spanish Ministry of Economy and Competitiveness under MINECO and FEDER projects TIN2014-53495-R and TEC2015-67387-C4-1-R.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Departament d’Enginyeria i Ciència dels Computadors, Universitat Jaume I de Castelló, Castellón, Spain
  2. Depto. de Sistemas Informáticos y Computación, Universitat Politècnica de València, Valencia, Spain
