Advertisement

Hardware architectures for the fast Fourier transform

  • Mario Garrido
  • Fahad Qureshi
  • Jarmo Takala
  • Oscar Gustafsson
Chapter

Abstract

The fast Fourier transform (FFT) is a widely used algorithm in signal processing applications. FFT hardware architectures are designed to meet the requirements of the most demanding applications in terms of performance, circuit area, and/or power consumption. This chapter summarizes the research on FFT hardware architectures by presenting the FFT algorithms, the building blocks in FFT hardware architectures, the architectures themselves, and the bit reversal algorithm.

References

  1. 1.
    Ahmed, T., Garrido, M., Gustafsson, O.: A 512-point 8-parallel pipelined feedforward FFT for WPAN. In: Proc. Asilomar Conf. Signals Syst. Comput., pp. 981–984 (2011)Google Scholar
  2. 2.
    Andraka, R.: A survey of CORDIC algorithms for FPGA based computers. In: Proc. ACM/SIGDA Int. Symp. Field-Programmable Gate Arrays, pp. 191–200. ACM (1998)Google Scholar
  3. 3.
    Argüello, F., Bruguera, J., Doallo, R., Zapata, E.: Parallel architecture for fast transforms with trigonometric kernel. IEEE Trans. Parallel Distrib. Syst. 5(10), 1091–1099 (1994)CrossRefGoogle Scholar
  4. 4.
    Bi, G., Jones, E.: A pipelined FFT processor for word-sequential data. IEEE Trans. Acoust., Speech, Signal Process. 37(12), 1982–1985 (1989)CrossRefGoogle Scholar
  5. 5.
    Boullis, N., Tisserand, A.: Some optimizations of hardware multiplication by constant matrices. IEEE Trans. Comput. 54(10), 1271–1282 (2005)CrossRefGoogle Scholar
  6. 6.
    Burrus, C., Eschenbacher, P.: An in-place, in-order prime factor FFT algorithm. Proc. IEEE Int. Symp. Circuits Syst. 29(4), 806–817 (1981)zbMATHGoogle Scholar
  7. 7.
    Chakraborty, T.S., Chakrabarti, S.: On output reorder buffer design of bit reversed pipelined continuous data FFT architecture. In: Proc. IEEE Asia-Pacific Conf. Circuits Syst., pp. 1132–1135. IEEE (2008)Google Scholar
  8. 8.
    Chan, S.C., Yiu, P.M.: An efficient multiplierless approximation of the fast Fourier transform using sum-of-powers-of-two (SOPOT) coefficients. IEEE Signal Process. Lett. 9(10), 322–325 (2002)CrossRefGoogle Scholar
  9. 9.
    Chang, Y.N.: An efficient VLSI architecture for normal I/O order pipeline FFT design. IEEE Trans. Circuits Syst. II 55(12), 1234–1238 (2008)CrossRefGoogle Scholar
  10. 10.
    Chang, Y.N.: Design of an 8192-point sequential I/O FFT chip. In: Proc. World Congress Eng. Comp. Science, vol. II (2012)Google Scholar
  11. 11.
    Chen, J., Hu, J., Lee, S., Sobelman, G.E.: Hardware efficient mixed radix-25/16/9 FFT for LTE systems. IEEE Trans. VLSI Syst. 23(2), 221–229 (2015)CrossRefGoogle Scholar
  12. 12.
    Cheng, C., Yu, F.: An optimum architecture for continuous-flow parallel bit reversal. IEEE Signal Process. Lett. 22(12), 2334–2338 (2015)CrossRefGoogle Scholar
  13. 13.
    Cho, S.I., Kang, K.M.: A low-complexity 128-point mixed-radix FFT processor for MB-OFDM UWB systems. ETRI J. 32(1), 1–10 (2010)CrossRefGoogle Scholar
  14. 14.
    Cho, S.I., Kang, K.M., Choi, S.S.: Implementation of 128-point fast Fourier transform processor for UWB systems. In: Proc. Int. Wireless Comm. Mobile Comp. Conf., pp. 210–213 (2008)Google Scholar
  15. 15.
    Cohen, D.: Simplified control of FFT hardware. IEEE Trans. Acoust., Speech, Signal Process. 24(6), 577–579 (1976)CrossRefGoogle Scholar
  16. 16.
    Cooley, J.W., Tukey, J.W.: An algorithm for the machine calculation of complex Fourier series. Math. Comput. 19, 297–301 (1965)MathSciNetCrossRefGoogle Scholar
  17. 17.
    Cortés, A., Vélez, I., Sevillano, J.F.: Radix r k FFTs: Matricial representation and SDC/SDF pipeline implementation. IEEE Trans. Signal Process. 57(7), 2824–2839 (2009)MathSciNetCrossRefGoogle Scholar
  18. 18.
    Dempster, A.G., Macleod, M.D.: Multiplication by two integers using the minimum number of adders. In: Proc. IEEE Int. Symp. Circuits Syst., vol. 2, pp. 1814–1817 (2005)Google Scholar
  19. 19.
    Despain, A.M.: Fourier transform computers using CORDIC iterations. IEEE Trans. Comput. C-23, 993–1001 (1974)CrossRefGoogle Scholar
  20. 20.
    Duhamel, P., Hollmann, H.: ’Split radix’ FFT algorithm. Electron. Lett. 20(1), 14–16 (1984)CrossRefGoogle Scholar
  21. 21.
    Edelman, A., Heller, S., Johnsson, L.: Index transformation algorithms in a linear algebra framework. IEEE Trans. Parallel Distrib. Syst. 5(12), 1302–1309 (1994)CrossRefGoogle Scholar
  22. 22.
    Fraser, D.: Array permutation by index-digit permutation. J. Assoc. Comp. Machinery (ACM) 23(2), 298–309 (1976)MathSciNetCrossRefGoogle Scholar
  23. 23.
    Garrido, M.: Efficient hardware architectures for the computation of the FFT and other related signal processing algorithms in real time. Ph.D. thesis, Universidad Politécnica de Madrid (2009)Google Scholar
  24. 24.
    Garrido, M.: A new representation of FFT algorithms using triangular matrices. IEEE Trans. Circuits Syst. I 63(10), 1737–1745 (2016)MathSciNetCrossRefGoogle Scholar
  25. 25.
    Garrido, M., Acevedo, M., Ehliar, A., Gustafsson, O.: Challenging the limits of FFT performance on FPGAs. In: Int. Symp. Integrated Circuits, pp. 172–175 (2014)Google Scholar
  26. 26.
    Garrido, M., Andersson, R., Qureshi, F., Gustafsson, O.: Multiplierless unity-gain SDF FFTs. IEEE Trans. VLSI Syst. 24(9), 3003–3007 (2016)CrossRefGoogle Scholar
  27. 27.
    Garrido, M., Grajal, J.: Efficient memoryless CORDIC for FFT computation. In: Proc. IEEE Int. Conf. Acoust. Speech Signal Process., vol. 2, pp. 113–116 (2007)Google Scholar
  28. 28.
    Garrido, M., Grajal, J., Gustafsson, O.: Optimum circuits for bit reversal. IEEE Trans. Circuits Syst. II 58(10), 657–661 (2011)CrossRefGoogle Scholar
  29. 29.
    Garrido, M., Grajal, J., Sánchez, M.A., Gustafsson, O.: Pipelined radix-2k feedforward FFT architectures. IEEE Trans. VLSI Syst. 21(1), 23–32 (2013)CrossRefGoogle Scholar
  30. 30.
    Garrido, M., Gustafsson, O., Grajal, J.: Accurate rotations based on coefficient scaling. IEEE Trans. Circuits Syst. II 58(10), 662–666 (2011)CrossRefGoogle Scholar
  31. 31.
    Garrido, M., Huang, S.J., Chen, S.G.: Feedforward FFT hardware architectures based on rotator allocation. IEEE Trans. Circuits Syst. I 65(2), 581–592 (2018)CrossRefGoogle Scholar
  32. 32.
    Garrido, M., Huang, S.J., Chen, S.G., Gustafsson, O.: The serial commutator (SC) FFT. IEEE Trans. Circuits Syst. II 63(10), 974–978 (2016)CrossRefGoogle Scholar
  33. 33.
    Garrido, M., Källström, P., Kumm, M., Gustafsson, O.: CORDIC II: A new improved CORDIC algorithm. IEEE Trans. Circuits Syst. II 63(2), 186–190 (2016)CrossRefGoogle Scholar
  34. 34.
    Garrido, M., Qureshi, F., Gustafsson, O.: Low-complexity multiplierless constant rotators based on combined coefficient selection and shift-and-add implementation (CCSSI). IEEE Trans. Circuits Syst. I 61(7), 2002–2012 (2014)CrossRefGoogle Scholar
  35. 35.
    Garrido, M., Sánchez, M., López-Vallejo, M., Grajal, J.: A 4096-point radix-4 memory-based FFT using DSP slices. IEEE Trans. VLSI Syst. 25(1), 375–379 (2017)CrossRefGoogle Scholar
  36. 36.
    Glittas, A.X., Sellathurai, M., Lakshminarayanan, G.: A normal I/O order radix-2 FFT architecture to process twin data streams for MIMO. IEEE Trans. VLSI Syst. 24(6), 2402–2406 (2016)CrossRefGoogle Scholar
  37. 37.
    Gold, B., Rader, C.M.: Digital Processing of Signals. New York: McGraw Hill (1969)zbMATHGoogle Scholar
  38. 38.
    Good, I.J.: The interaction algorithm and practical Fourier analysis. J. Royal Statistical Society B 20(2), 361–372 (1958)MathSciNetzbMATHGoogle Scholar
  39. 39.
    Granata, J., Conner, M., Tolimieri, R.: Recursive fast algorithm and the role of the tensor product. IEEE Trans. Signal Process. 40(12), 2921–2930 (1992)CrossRefGoogle Scholar
  40. 40.
    Gustafsson, O.: A difference based adder graph heuristic for multiple constant multiplication problems. In: Proc. IEEE Int. Symp. Circuits Syst., pp. 1097–1100. IEEE (2007)Google Scholar
  41. 41.
    Gustafsson, O.: On lifting-based fixed-point complex multiplications and rotations. In: Proc. IEEE Symp. Comput. Arithmetic (2017)Google Scholar
  42. 42.
    Gustafsson, O., Dempster, A.G., Johansson, K., Macleod, M.D., Wanhammar, L.: Simplified design of constant coefficient multipliers. Circuits Syst. Signal Process. 25(2), 225–251 (2006)MathSciNetCrossRefGoogle Scholar
  43. 43.
    Gustafsson, O., Qureshi, F.: Addition aware quantization for low complexity and high precision constant multiplication. IEEE Signal Processing Letters 17(2), 173–176 (2010)CrossRefGoogle Scholar
  44. 44.
    Gustafsson, O., Wanhammar, L.: Arithmetic. In: S.S. Bhattacharyya, E.F. Deprettere, R. Leupers, J. Takala (eds.) Handbook of Signal Processing Systems, third edn. Springer (2018)Google Scholar
  45. 45.
    He, S., Torkelson, M.: Design and implementation of a 1024-point pipeline FFT processor. pp. 131–134 (1998)Google Scholar
  46. 46.
    Hsiao, C.F., Chen, Y., Lee, C.Y.: A generalized mixed-radix algorithm for memory-based FFT processors. IEEE Trans. Circuits Syst. II 57(1), 26–30 (2010)CrossRefGoogle Scholar
  47. 47.
    Hsiao, S.F., Lee, C.H., Cheng, Y.C., Lee, A.: Designs of angle-rotation in digital frequency synthesizer/mixer using multi-stage architectures. In: Proc. Asilomar Conf. Signals Syst. Comput., pp. 2181–2185 (2011)Google Scholar
  48. 48.
    Hu, Y., Naganathan, S.: An angle recoding method for CORDIC algorithm implementation. IEEE Trans. Comput. 42(1), 99–102 (1993)CrossRefGoogle Scholar
  49. 49.
    Huang, S.J., Chen, S.G.: A high-throughput radix-16 FFT processor with parallel and normal input/output ordering for IEEE 802.15.3c systems. IEEE Trans. Circuits Syst. I 59(8), 1752–1765 (2012)MathSciNetCrossRefGoogle Scholar
  50. 50.
    Huang, S.J., Chen, S.G., Garrido, M., Jou, S.J.: Continuous-flow parallel bit-reversal circuit for MDF and MDC FFT architectures. IEEE Trans. Circuits Syst. I 61(10), 2869–2877 (2014)CrossRefGoogle Scholar
  51. 51.
    Jang, J.K., Kim, M.G., Sunwoo, M.H.: Efficient scheduling scheme for eight-parallel MDC FFT processor. In: Proc. Int. SoC Design Conf., pp. 277–278 (2015)Google Scholar
  52. 52.
    Järvinen, T.: Systematic methods for designing stride permutation interconnections. Ph.D. thesis, Tampere Univ. of Technology (2004)Google Scholar
  53. 53.
    Järvinen, T., Salmela, P., Sorokin, H., Takala, J.: Stride permutation networks for array processors. In: Proc. IEEE Int. Applicat.-Specific Syst. Arch. Processors Conf., pp. 376–386 (2004)Google Scholar
  54. 54.
    Järvinen, T., Salmela, P., Sorokin, H., Takala, J.: Stride permutation networks for array processors. J. VLSI Signal Process. Syst. 49(1), 51–71 (2007)CrossRefGoogle Scholar
  55. 55.
    Jo, B.G., Sunwoo, M.H.: New continuous-flow mixed-radix (CFMR) FFT processor using novel in-place strategy. IEEE Trans. Circuits Syst. I 52(5), 911–919 (2005)MathSciNetCrossRefGoogle Scholar
  56. 56.
    Johnston, J.A.: Parallel pipeline fast Fourier transformer. In: IEE Proc. F Comm. Radar Signal Process., vol. 130, pp. 564–572 (1983)Google Scholar
  57. 57.
    Källström, P., Garrido, M., Gustafsson, O.: Low-complexity rotators for the FFT using base-3 signed stages. In: Proc. IEEE Asia-Pacific Conf. Circuits Syst., pp. 519–522 (2012)Google Scholar
  58. 58.
    Kim, M.G., Shin, S.K., Sunwoo, M.H.: New parallel MDC FFT processor with efficient scheduling scheme. In: Proc. IEEE Asia-Pacific Conf. Circuits Syst., pp. 667–670 (2014)Google Scholar
  59. 59.
    Kristensen, F., Nilsson, P., Olsson, A.: Flexible baseband transmitter for OFDM. In: Proc. IASTED Conf. Circuits Signals Syst., pp. 356–361 (2003)Google Scholar
  60. 60.
    Kumm, M., Hardieck, M., Zipf, P.: Optimization of constant matrix multiplication with low power and high throughput. IEEE Transactions on Computers PP(99), 1–1 (2017)Google Scholar
  61. 61.
    Lee, H.Y., Park, I.C.: Balanced binary-tree decomposition for area-efficient pipelined FFT processing. IEEE Trans. Circuits Syst. I 54(4), 889–900 (2007)CrossRefGoogle Scholar
  62. 62.
    Lee, J., Lee, H., in Cho, S., Choi, S.S.: A high-speed, low-complexity radix-24 FFT processor for MB-OFDM UWB systems. In: Proc. IEEE Int. Symp. Circuits Syst., pp. 210–213 (2006)Google Scholar
  63. 63.
    Li, C.C., Chen, S.G.: A radix-4 redundant CORDIC algorithm with fast on-line variable scale factor compensation. In: Proc. IEEE Int. Conf. Acoust. Speech Signal Process., vol. 1, pp. 639–642 (1997)Google Scholar
  64. 64.
    Li, N., van der Meijs, N.: A radix 22 based parallel pipeline FFT processor for MB-OFDM UWB system. In: Proc. IEEE Int. SOC Conf., pp. 383–386 (2009)Google Scholar
  65. 65.
    Li, S., Xu, H., Fan, W., Chen, Y., Zeng, X.: A 128/256-point pipeline FFT/IFFT processor for MIMO OFDM system IEEE 802.16e. In: Proc. IEEE Int. Symp. Circuits Syst., pp. 1488–1491 (2010)Google Scholar
  66. 66.
    Lin, C.H., Wu, A.Y.: Mixed-scaling-rotation CORDIC (MSR-CORDIC) algorithm and architecture for high-performance vector rotational DSP applications. IEEE Trans. Circuits Syst. I 52(11), 2385–2396 (2005)CrossRefGoogle Scholar
  67. 67.
    Lin, Y.W., Lee, C.Y.: Design of an FFT/IFFT processor for MIMO OFDM systems. IEEE Trans. Circuits Syst. I 54(4), 807–815 (2007)MathSciNetCrossRefGoogle Scholar
  68. 68.
    Liu, H., Lee, H.: A high performance four-parallel 128/64-point radix-24 FFT/IFFT processor for MIMO-OFDM systems. In: Proc. IEEE Asia Pacific Conf. Circuits Syst., pp. 834–837 (2008)Google Scholar
  69. 69.
    Liu, L., Ren, J., Wang, X., Ye, F.: Design of low-power, 1GS/s throughput FFT processor for MIMO-OFDM UWB communication system. In: Proc. IEEE Int. Symp. Circuits Syst., pp. 2594–2597 (2007)Google Scholar
  70. 70.
    Liu, X., Yu, F., Wang, Z.: A pipelined architecture for normal I/O order FFT. Journal of Zhejiang University - Science C 12(1), 76–82 (2011)CrossRefGoogle Scholar
  71. 71.
    Ma, Y., Wanhammar, L.: A hardware efficient control of memory addressing for high-performance FFT processors. IEEE Trans. Signal Process. 48(3), 917–921 (2000)CrossRefGoogle Scholar
  72. 72.
    Ma, Z.G., Yin, X.B., Yu, F.: A novel memory-based FFT architecture for real-valued signals based on a radix-2 decimation-in-frequency algorithm. IEEE Trans. Circuits Syst. II 62(9), 876–880 (2015)CrossRefGoogle Scholar
  73. 73.
    Macleod, M.D.: Multiplierless implementation of rotators and FFTs. EURASIP J. Appl. Signal Process. 2005(17), 2903–2910 (2005)zbMATHGoogle Scholar
  74. 74.
    Majumdar, M., Parhi, K.K.: Design of data format converters using two-dimensional register allocation. IEEE Trans. Circuits Syst. II 45(4), 504–508 (1998)CrossRefGoogle Scholar
  75. 75.
    Meher, P.K., Park, S.Y.: CORDIC designs for fixed angle of rotation. IEEE Trans. VLSI Syst. 21(2), 217–228 (2013)CrossRefGoogle Scholar
  76. 76.
    Meher, P.K., Valls, J., Juang, T.B., Sridharan, K., Maharatna, K.: 50 years of CORDIC: Algorithms, architectures, and applications. IEEE Trans. Circuits Syst. I 56(9), 1893–1907 (2009)MathSciNetCrossRefGoogle Scholar
  77. 77.
    Möller, K., Kumm, M., Garrido, M., Zipf, P.: Optimal shift reassignment in reconfigurable constant multiplication circuits. IEEE Trans. Comput.-Aided Design Integr. Circuits Syst. (2017). Accepted for publicationGoogle Scholar
  78. 78.
    Oh, J.Y., Lim, M.S.: New radix-2 to the 4th power pipeline FFT processor. IEICE Trans. Electron. E88-C(8), 1740–1746 (2005)CrossRefGoogle Scholar
  79. 79.
    Paeth, A.W.: A fast algorithm for general raster rotation. In: Proc. Graphics Interface, pp. 77–81 (1986)Google Scholar
  80. 80.
    Parhi, K.K.: Systematic synthesis of DSP data format converters using life-time analysis and forward-backward register allocation. IEEE Trans. Circuits Syst. II 39(7), 423–440 (1992)CrossRefGoogle Scholar
  81. 81.
    Park, S.Y., Yu, Y.J.: Fixed-point analysis and parameter selections of MSR-CORDIC with applications to FFT designs. IEEE Trans. Signal Process. 60(12), 6245–6256 (2012)MathSciNetCrossRefGoogle Scholar
  82. 82.
    Püschel, M., Milder, P.A., Hoe, J.C.: Permuting streaming data using RAMs. J. ACM 56(2), 10:1–10:34 (2009)MathSciNetCrossRefGoogle Scholar
  83. 83.
    Qureshi, F., Gustafsson, O.: Generation of all radix-2 fast Fourier transform algorithms using binary trees. In: Proc. Europ. Conf. Circuit Theory Design, pp. 677–680 (2011)Google Scholar
  84. 84.
    Qureshi, F., Gustafsson, O.: Low-complexity constant multiplication based on trigonometric identities with applications to FFTs. IEICE Trans. Fundamentals E94-A(11), 324–326 (2011)CrossRefGoogle Scholar
  85. 85.
    Reisis, D., Vlassopoulos, N.: Conflict-free parallel memory accessing techniques for FFT architectures. IEEE Trans. Circuits Syst. I 55(11), 3438–3447 (2008)MathSciNetCrossRefGoogle Scholar
  86. 86.
    Sánchez, M., Garrido, M., López, M., Grajal, J.: Implementing FFT-based digital channelized receivers on FPGA platforms. IEEE Trans. Aerosp. Electron. Syst. 44(4), 1567–1585 (2008)CrossRefGoogle Scholar
  87. 87.
    Serre, F., Holenstein, T., Püschel, M.: Optimal circuits for streamed linear permutations using RAM. In: Proc. ACM/SIGDA Int. Symp. Field-Programmable Gate Arrays, pp. 215–223. ACM (2016)Google Scholar
  88. 88.
    Shih, X.Y., Liu, Y.Q., Chou, H.R.: 48-mode reconfigurable design of SDF FFT hardware architecture using radix-32 and radix-23 design approaches. IEEE Trans. Circuits Syst. I 64(6), 1456–1467 (2017)CrossRefGoogle Scholar
  89. 89.
    Shukla, R., Ray, K.: Low latency hybrid CORDIC algorithm. IEEE Trans. Comput. 63(12), 3066–3078 (2014)MathSciNetCrossRefGoogle Scholar
  90. 90.
    Stone, H.: Parallel processing with the perfect shuffle. IEEE Trans. Comput. C-20(2), 153–161 (1971)CrossRefGoogle Scholar
  91. 91.
    Takagi, N., Asada, T., Yajima, S.: Redundant CORDIC methods with a constant scale factor for sine and cosine computation. IEEE Trans. Comput. 40(9), 989–995 (1991)MathSciNetCrossRefGoogle Scholar
  92. 92.
    Takala, J., Järvinen, T.: Stride Permutation Access In Interleaved Memory Systems. Domain-Specific Processors: Systems, Architectures, Modeling, and Simulation, S. Bhattacharyya, E. Deprettere and J. Teich. CRC Press (2003)CrossRefGoogle Scholar
  93. 93.
    Takala, J., Jarvinen, T., Sorokin, H.: Conflict-free parallel memory access scheme for FFT processors. In: Proc. IEEE Int. Symp. Circuits Syst., vol. 4, pp. 524–527 (2003)Google Scholar
  94. 94.
    Tang, S.N., Tsai, J.W., Chang, T.Y.: A 2.4-GS/s FFT processor for OFDM-based WPAN applications. IEEE Trans. Circuits Syst. II 57(6), 451–455 (2010)CrossRefGoogle Scholar
  95. 95.
    Tsai, P.Y., Lin, C.Y.: A generalized conflict-free memory addressing scheme for continuous-flow parallel-processing FFT processors with rescheduling. IEEE Trans. VLSI Syst. 19(12), 2290–2302 (2011)CrossRefGoogle Scholar
  96. 96.
    Tummeltshammer, P., Hoe, J.C., Püschel, M.: Time-multiplexed multiple-constant multiplication. IEEE Trans. Comput.-Aided Design Integr. Circuits Syst. 26(9), 1551–1563 (2007)CrossRefGoogle Scholar
  97. 97.
    Volder, J.E.: The CORDIC trigonometric computing technique. IRE Trans. Electronic Computing EC-8, 330–334 (1959)CrossRefGoogle Scholar
  98. 98.
    Voronenko, Y., Püschel, M.: Multiplierless multiple constant multiplication. ACM Trans. Algorithms 3, 1–39 (2007)MathSciNetCrossRefGoogle Scholar
  99. 99.
    Wang, J., Xiong, C., Zhang, K., Wei, J.: A mixed-decimation MDF architecture for radix- 2k parallel FFT. IEEE Trans. VLSI Syst. 24(1), 67–78 (2016)CrossRefGoogle Scholar
  100. 100.
    Wang, Z., Liu, X., He, B., Yu, F.: A combined SDC-SDF architecture for normal I/O pipelined radix-2 FFT. IEEE Trans. VLSI Syst. 23(5), 973–977 (2015)CrossRefGoogle Scholar
  101. 101.
    Wenzler, A., Luder, E.: New structures for complex multipliers and their noise analysis. In: Proc. IEEE Int. Symp. Circuits Syst., vol. 2, pp. 1432–1435 (1995)Google Scholar
  102. 102.
    Wold, E., Despain, A.: Pipeline and parallel-pipeline FFT processors for VLSI implementations. IEEE Trans. Comput. C-33(5), 414–426 (1984)CrossRefGoogle Scholar
  103. 103.
    Wu, C.S., Wu, A.Y.: Modified vector rotational CORDIC (MVR-CORDIC) algorithm and architecture. IEEE Trans. Circuits Syst. II 48(6), 548–561 (2001)CrossRefGoogle Scholar
  104. 104.
    Wu, C.S., Wu, A.Y., Lin, C.H.: A high-performance/low-latency vector rotational CORDIC architecture based on extended elementary angle set and trellis-based searching schemes. IEEE Trans. Circuits Syst. II 50(9), 589–601 (2003)CrossRefGoogle Scholar
  105. 105.
    Xia, K.F., Wu, B., Xiong, T., Ye, T.C.: A memory-based FFT processor design with generalized efficient conflict-free address schemes. IEEE Trans. VLSI Syst. 25(6), 1919–1929 (2017)CrossRefGoogle Scholar
  106. 106.
    Xing, Q., Ma, Z., Xu, Y.: A novel conflict-free parallel memory access scheme for FFT processors. IEEE Trans. Circuits Syst. II (2017)Google Scholar
  107. 107.
    Xudong, W., Yu, L.: Special-purpose computer for 64-point FFT based on FPGA. In: Proc. Int. Conf. Wireless Comm. Signal Process., pp. 1–3 (2009)Google Scholar
  108. 108.
    Yang, K.J., Tsai, S.H., Chuang, G.: MDC FFT/IFFT processor with variable length for MIMO-OFDM systems. IEEE Trans. VLSI Syst. 21(4), 720–731 (2013)CrossRefGoogle Scholar
  109. 109.
    Yang, L., Zhang, K., Liu, H., Huang, J., Huang, S.: An efficient locally pipelined FFT processor. IEEE Trans. Circuits Syst. II 53(7), 585–589 (2006)CrossRefGoogle Scholar
  110. 110.
    Yeh, W.C., Jen, C.W.: High-speed and low-power split-radix FFT. IEEE Trans. Signal Process. 51(3), 864–874 (2003)MathSciNetCrossRefGoogle Scholar
  111. 111.
    Yu, C., Yen, M.H.: Area-efficient 128- to 2048∕1536-point pipeline FFT processor for LTE and mobile WiMAX systems. IEEE Trans. VLSI Syst. 23(9), 1793–1800 (2015)MathSciNetCrossRefGoogle Scholar
  112. 112.
    Zheng, W., Li, K.: Split radix algorithm for length 6m DFT. IEEE Signal Process. Lett. 20(7), 713–716 (2013)CrossRefGoogle Scholar

Copyright information

© Springer International Publishing AG, part of Springer Nature 2019

Authors and Affiliations

  • Mario Garrido
    • 1
  • Fahad Qureshi
    • 2
  • Jarmo Takala
    • 3
  • Oscar Gustafsson
    • 1
  1. 1.Linköping UniversityLinköpingSweden
  2. 2.Tampere University of TechnologyTampereFinland
  3. 3.Department of Pervasive ComputingTampere University of TechnologyTampereFinland

Personalised recommendations