High-Throughput FFT-SPA Decoder Implementation for Non-Binary LDPC Codes on x86 Multicore Processors

  • Bertrand Le GalEmail author
  • Christophe Jego


Low-Density Parity-Check (LDPC) codes are a well known Error Correction Code family used for instance, in wireless and satellite communication links. Error correction performance of LDPC codes was further enhanced by extending it to higher order Galois fields, giving rise hence to the so-called non-binary LDPC codes (NB-LDPC). Error correction performance improvement (CCSDS 2014) is the main reason behind the adoption by the Consultative Committee for Space Data Systems (CCSDS) of the NB-LDPC codes in the experimental specification for the future next generation uplinks (CCSDS 2014, 2015). The high error correction efficiency for short frames make NB-LDPC codes a good candidate for IoT applications. However, the performance gain comes at the expense of a high decoding computational complexity (CCSDS 2014; Conde-Canencia et al. 2009). In this paper, an x86 multicore NB-LDPC decoder implementation is provided. This decoder that implements the FFT-SPA algorithm provides a throughput improvement of about 1.3 × to 2.7 ×, a latency reduction of more than 95% and a power consumption halved in comparison with the most efficient works on GPU (Graphics Processing Unit) device. Indeed, an efficient memory mapping and computation optimizations on the x86 architecture enable to achieve a higher decoding throughput than the GPU-based in similar experimental setup. Consequently, the throughput efficiency, the low processing latency associated with a low power consumption makes this proposed multicore implementation practical and attractive for real time implementations of NB-LDPC decoders in future SDR or Cloud-RAN systems for CCSDS standard and IoT applications.


NB-LDPC FFT-SPA Multi-core SIMD Error correction codes 



The authors would also like to thank INTEL Inc., for providing access to vLab Academic Research Environment.


  1. 1.
    Consultative Committee for Space Data Systems (CCSDS). (2014). CCSDS 231.1-O-1 - Green Book - Next Generation Uplink - Informational RePORT CCSDS 230.2-G-1. Washington, DC, USA.Google Scholar
  2. 2.
    Consultative Committee for Space Data Systems (CCSDS). (2015). CCSDS 231.1-O-1 - Orange Book - Short Block Length LDPC codes for TC Synchronization and Channel Coding Washington, DC, USA.Google Scholar
  3. 3.
    Conde-Canencia, L., Al-Ghouwayel, A., Boutillon, E. (2009). Complexity comparison of Non-Binary LDPC decoders. In Proceedings of the ICT mobile summit conference, pp. 1–8, Santander, Spain.Google Scholar
  4. 4.
    Gallager, R.G. (1962). Low density parity check codes. IRE Transactions on Information Theory, 8(1), 21–28.MathSciNetCrossRefzbMATHGoogle Scholar
  5. 5.
    DVB Document A83-2. (2014). Digital video broadcasting (DVB) - part II: S2-extensions (DVB-s2x) march.Google Scholar
  6. 6.
    Consultative Committee for Space Data Systems (CCSDS). (2011). CCSDS 131.0-B-2 - TM synchronization and channel coding. Washington, DC, USA, blue book edition.Google Scholar
  7. 7.
    Davey, M.C., & MacKay, D.J.C. (1998). Low density parity check codes over GF(q). IEEE Communications Letters, 2(6), 165–167.CrossRefGoogle Scholar
  8. 8.
    Barnault, L., & Declercq, D. (2003). Fast decoding algorithm for LDPC over GF(2q). In Proceedings of the IEEE information theory workshop (pp. 70–73).Google Scholar
  9. 9.
    Dolecek, L., Divsalar, D., Sun, Y., Amiri, B. (2014). Non-binary protograph-based LDPC codes: enumerators, analysis, and designs. IEEE Transactions on Information Theory, 60(7), 3913–3941.MathSciNetCrossRefzbMATHGoogle Scholar
  10. 10.
    Wang, G, Shen, H., Yin, B., Wu, M., Sun, Y., Cavallaro, J.R. (2012). Parallel nonbinary LDPC decoding on GPU. In Proceedings of the conference record of 46th Asilomar conference on signals, systems and computers (ASILOMAR) (pp. 1277–1281).Google Scholar
  11. 11.
    Andrade, J., Falcao, G., Silva, V., Kasai, K. (2013). FFT-SPA non-binary LDPC decoding on GPU. In Proceedings of the IEEE international conference on acoustics, speech and signal processing (ICASSP). Vancouver, BC (pp. 5099–5103).Google Scholar
  12. 12.
    Beermann, M., Monzo, E., Schmalen, L., Vary, P. (2013). High speed decoding of non-binary irregular LDPC codes using GPUs. In Proceedings of the IEEE workshop on signal processing systems (SiPS) (pp. 36–41).Google Scholar
  13. 13.
    Andrade, J., Falcao, G., Silva, V. (2014). Optimized fast walsh-hadamard transform on GPUs, for non-binary LDPC decoding. Elsevier Journal of Parallel Computing, 40(9), 449–453.MathSciNetCrossRefGoogle Scholar
  14. 14.
    Thi, H.P., Ajaz, S., Lee, H. (2014). Efficient Min-Max nonbinary LDPC decoding on GPU. In Proceedings of the international SoC design conference (ISOCC) (pp. 266–267).Google Scholar
  15. 15.
    Beermann, M., Schmalen, L., Vary, P. (2015). GPU accelerated belief propagation decoding of non-binary LDPC codes with parallel and sequential scheduling. Springer, Journal of Signal Processing Systems, 78(1), 21–34.CrossRefGoogle Scholar
  16. 16.
    Pham, H.T., Ajaz, S., Lee, H. (2015). Parallel block-layered nonbinary QC-LDPC decoding on GPU. In Proceedings of the IEEE workshop on signal processing systems (SiPS) (pp. 1–6).Google Scholar
  17. 17.
    Liu, Z., Liu, R., Hou, Y., Zhao, L. (2018). High-throughput multi-codeword decoder for non-binary LDPC codes on GPU. IEEE Communications Letters, 22, 486–489.CrossRefGoogle Scholar
  18. 18.
    Boutillon, E., Conde-Canencia, L., Ghouwayel, A.A. (2013). Design of a GF(64)-LDPC decoder based on the EMS algorithm. IEEE Transactions on Circuits and Systems (TCAS-I), 60(10), 2644–2656.MathSciNetCrossRefGoogle Scholar
  19. 19.
    Abassi, O, Conde-Canencia, L, Ghouwayel, A.A., Boutillon, E. (2017). A novel architecture for elementary check node processing in non-binary LDPC decoders. IEEE Transactions on Circuits and Systems II, (TCAS-II), 64 (2), 136–140.CrossRefGoogle Scholar
  20. 20.
    Chang, C.-C., Chang, Y.-L., Huang, M.-Y., Huang, B. (2011). Accelerating regular LDPC code decoders on GPUs. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 4(3), 653–659.CrossRefGoogle Scholar
  21. 21.
    Falcao, G., Silva, V., Sousa, L., Andrade, J. (2012). Portable LDPC decoding on multicores using OpenCL. IEEE Signal Processing Magazine, 29(4), 81–88.CrossRefGoogle Scholar
  22. 22.
    Wang, G., Wu, M., Yin, B., Cavallaro, J.R. (2013). High throughput low latency LDPC decoding on GPU for SDR systems. In Proceedings of the IEEE GlobalSIP conference (pp. 1258–1261).Google Scholar
  23. 23.
    Li, R., Dou, Y., Zou, D., Wang, S., Zhang, Y. (2013). Efficient graphics processing unit based layered decoders for quasicyclic low-density parity-check codes. Concurrency and Computation: Practice and Experience, 27 (1), 29–46.CrossRefGoogle Scholar
  24. 24.
    Lin, Y., & Niu, W. (2014). High throughput LDPC decoder on GPU. IEEE Communications Letters, 18(2), 344–347.CrossRefGoogle Scholar
  25. 25.
    Le Gal, B., & Jego, C. (2016). High-throughput multi-core LDPC decoders based on x86 processor. IEEE Transactions on Parallel and Distributed Systems (TPDS), 27(5), 1373–1386.CrossRefGoogle Scholar
  26. 26.
    Andrade, J., Falcao, G., Silva, V., Sousa, L. (2016). A survey on programmable LDPC decoders. IEEE Access, 4, 6704–6718.CrossRefGoogle Scholar
  27. 27.
    Le Gal, B., Leroux, C., Jego, C. (2014). Software polar decoder on an embedded processor. In Proceedings of the IEEE workshop on signal processing systems (SiPS) (pp. 1–6).Google Scholar
  28. 28.
    Le Gal, B., Leroux, C., Jego, C. (2015). Multi-Gb/s, software decoding of polar codes. IEEE Transactions on Signal Processing, 63(2), 349–359.MathSciNetCrossRefzbMATHGoogle Scholar
  29. 29.
    Giard, P., Sarkis, G., Leroux, C., Thibeault, C., Gross, W.J. (2018). Low-latency software polar decoders. Journal of Signal Processing Systems, Springer, 90, 761–775.CrossRefGoogle Scholar
  30. 30.
    Grayver, E. (2013). Implementing software defined radio. New York: Springer.CrossRefGoogle Scholar
  31. 31.
    Cassagne, A., Tonnellier, T., Leroux, C., Le Gal, B., Aumage, O., Barthou, D. (2016). Beyond Gbps turbo decoder on multi-core CPUs. In Proceedings of the 9th International Symposium on Turbo Codes & Iterative Information Processing (ISTC’16), Brest, France.Google Scholar
  32. 32.
    Carrasco, R.A., & Johnston, M. (2008). Non-Binary error control coding for wireless communication and data storage. Chichester: Wiley.CrossRefGoogle Scholar
  33. 33.
    Hocevar, D.E. (2004). A reduced complexity decoder architecture via layered decoding of LDPC codes. In Proceedings of the IEEE workshop on signal processing systems (SIPS) workshop (pp. 107–112).Google Scholar
  34. 34.
    Declercq, D., & Fossorier, M. (2007). Decoding algorithms for nonbinary LDPC codes over GF. IEEE Transactions on Communications, 55(4), 633–643.CrossRefGoogle Scholar
  35. 35.
    Voicila, A., Declercq, D., Verdier, F., Fossorier, M., Urard, P. (2007). Low complexity, low memory EMS algorithm for non-binary LDPC codes. In IEEE Internationnal Conference on Communications (ICC’07), Glasgow, England.Google Scholar
  36. 36.
    Le Gal, B., & Jego, C. (2015). High-throughput LDPC decoder on low-power embedded processors. IEEE Communications Letters, 19(11), 1861–1864.CrossRefGoogle Scholar
  37. 37.
    Gronroos, S., Nybom, K., Bjorkqvist, J. (2012). Efficient GPU and CPU-based LDPC decoders for long codewords. Journal of Analog Integrated Circuits and Signal Processing, 73, 583—595.CrossRefGoogle Scholar
  38. 38.
    Wymeersch, H., Steendam, H., Moeneclaey, M. (2004). Log-domain decoding of LDPC codes over GF(q). In IEEE international conference on communications (ICC’04), pp. 772–776, Paris, France.Google Scholar
  39. 39.
    Poulliat, C., Fossorier, M., Declercq, D. (2008). Design of regular (2, d c)-LDPC codes over GF(q) using their binary images. IEEE Transactions on Communications, 56(10), 1626–1635.CrossRefGoogle Scholar
  40. 40.
    Helmling, M., & Scholl, S. Database of channel codes and ml simulation results. University of Kaiserslautern,
  41. 41.
    Ying, Y., You, K., Zhou, L., Quan, H., Zeng, X. (2012). A pure software LDPC decoder on a multi-core processor platform with reduced inter-processor communication cost. In Proceedings of the ISCAS Symposium (pp. 2609–2612).Google Scholar
  42. 42.
    Le Gal, B., Jego, C., Crenne, J. (2014). A high throughput efficient approach for decoding LDPC codes onto GPU devices. IEEE Embedded Systems Letters, 6(2), 29–32.CrossRefGoogle Scholar
  43. 43.
    Peng, H., Liu, R., Hou, Y., Zhao, L. (2016). A Gb/s parallel block-based Viterbi decoder for convolutional codes on GPU. In Proceedings of the 8th international conference on wireless communications & signal processing (WCSP) (p. 1-6).Google Scholar

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. 1.IMS Lab.Univ. BordeauxBordeaux INPFrance

Personalised recommendations