Abstract
Filtering is an essential process in the spectral analysis of signals, and window functions are commonly used in filter design. In this paper, we present a variety of window functions based on the scale-free COordinate Rotation DIgital Computer (CORDIC) algorithm, with a target angle range that covers the complete coordinate space. To address the area and speed limitations of earlier designs, we study two scale-free CORDIC algorithm-based window function architectures. We present simulation and synthesis results for the two designs, which are coded in very high speed integrated circuit hardware description language (VHDL). The Xilinx 13.1 software is used for simulation and synthesis of the coded designs, and the designs are mapped onto a Virtex-5 (XC5VLX20T-FF323) field-programmable gate array (FPGA) device.
1 Introduction
Window filtering techniques [1] are widely used in signal processing, for both time-domain and frequency-domain processing. For finite impulse response (FIR) filters, several window functions have been developed to meet requirements such as sidelobe reduction and dynamic range. Over the past few decades, hardware-efficient VLSI architectures for window function generators were designed using lookup tables, but these architectures occupied a large area and had high latency and long word lengths. With this technique, the window length could not be varied, and fixed-length window functions are inefficient. Therefore, reconfigurable window function architectures of flexible size were implemented based on the CORDIC algorithm [2, 3]. The CORDIC algorithm has various applications, such as the design of digital filters [4], FFTs [5] and several window functions. CORDIC is also useful for computing many transcendental and algebraic functions used in VLSI design, such as multiplication, division, hyperbolic tangent and the sigmoid function. CORDIC is hardware efficient, simple and of low computational complexity. Its major drawback is that it results in high latency or a large hardware cost for the scale factor compensation network. To minimize latency and reduce the number of iterations, parallel CORDIC architectures [6, 7] have been proposed. In parallel CORDIC architectures, the latency is reduced, but the hardware cost and the time needed to implement scale factor compensation increase. Hardware savings have been sought by employing a variable scale factor compensation network [8, 9], but these methods increase the area or otherwise affect throughput or latency. Scale-free CORDIC processors [10,11,12] have been suggested as a solution that saves hardware area and reduces hardware complexity while providing high throughput and low latency.
Our work deals with the study of a scale-free CORDIC processor design with two different window functions of variable length; that is, the study uses reconfigurable architectures.
The rest of the paper is organized as follows: Sect. 2 presents an overview of the CORDIC algorithm. In Sect. 3, we present design aspects of the scale-free CORDIC processor. The design of scale-free CORDIC processor-based architectures for different window functions is given in Sect. 4. Section 5 presents simulation results along with the performance measures, and the conclusions are given in Sect. 6.
2 CORDIC Algorithm
(A) Basic CORDIC Algorithm
Jack E. Volder invented the original CORDIC algorithm [13, 14]. It converts rectangular coordinates (x, y) to polar coordinates (R, θ), using only shift and add steps to perform the vector rotation. The basic CORDIC algorithm equations are:
If the rotation angle θ is divided into a set of small angles applied in a sequence of steps, θ can be approximated by \( \theta = \sum\nolimits_{i = 0}^{n} \delta_{i} \theta_{i} \), where \( \delta_{i} \in \left\{ 1, -1 \right\} \) is the sign of the rotation (positive for counterclockwise and negative for clockwise rotation). Several values are admissible for the rotation steps. The choice \( \theta_{i} = \tan^{-1} 2^{-i} \) is made because it is easy to implement in hardware; the new coordinates after each rotation, \( \left( x_{i+1}, y_{i+1} \right) \), can then be expressed as
Defined as the scale factor.
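The iteration described above can be sketched in floating-point Python as follows. This is a minimal illustration of conventional rotation-mode CORDIC in its standard textbook form, not the VHDL of this paper; the function name `cordic_rotate`, the iteration count and the use of floating-point multiplications by \( 2^{-i} \) in place of hardware shifts are assumptions made for the example.

```python
import math

def cordic_rotate(theta, n_iter=16):
    """Rotate the unit vector (1, 0) by theta (radians) with conventional
    circular CORDIC and return approximations of (cos(theta), sin(theta)).
    Hypothetical helper; converges for |theta| up to about 1.74 rad."""
    # Elementary angles theta_i = atan(2^-i); hardware would store these in a ROM.
    angles = [math.atan(2.0 ** -i) for i in range(n_iter)]
    # Scale factor K = product of cos(theta_i) = product of 1 / sqrt(1 + 2^-2i).
    k = 1.0
    for i in range(n_iter):
        k /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = 1.0, 0.0, theta
    for i in range(n_iter):
        d = 1.0 if z >= 0 else -1.0   # delta_i: direction of this micro-rotation
        # Shift-and-add update; the tuple assignment uses the old x and y.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]            # residual angle still to be rotated
    return x * k, y * k               # scale factor compensation

c, s = cordic_rotate(0.6)
print(c, s, math.cos(0.6), math.sin(0.6))
```

The final multiplication by `k` is the scale factor compensation step whose hardware cost motivates the scale-free variants discussed later.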
(B) Unified CORDIC Algorithm
In 1971, J. S. Walther generalized the CORDIC algorithm [15, 16] to three different trajectories: circular (m = 1), linear (m = 0) and hyperbolic (m = −1). For each trajectory, two modes of operation are available (vectoring and rotation). In vectoring mode, a vector with starting coordinates \( \left( x_{0}, y_{0} \right) \) is rotated so that it finally lies on the abscissa, by iteratively converging \( y_{n} \) to zero. In rotation mode, a vector with starting coordinates \( \left( x_{0}, y_{0} \right) \) is rotated by an angle \( \theta_{0} \) so that the final value of the angle register converges to zero. The unified CORDIC algorithm is defined as follows:
where \( m = \begin{cases} -1 & \text{for hyperbolic} \\ 0 & \text{for linear} \\ 1 & \text{for circular} \end{cases} \)
Six operational modes result from the different combinations of the three trajectories and the two modes; they are summarized in Table 1.
Direct computation: multiplication \( x \times y \), division \( \frac{y}{x} \).
Trigonometric and hyperbolic functions: \( \sin z,\,\cos z,\,\tan z,\,\sinh z,\,\cosh z,\,\tanh z,\,\tan^{-1} z,\,\tanh^{-1} z \).
Additional functions may be computed by choosing an appropriate combination of multiple modes of operation and an appropriate initialization.
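As an illustration of the direct computations listed above, the linear trajectory (m = 0) can be sketched in Python. This is a hedged example of the standard linear-mode recurrences, not the linear CORDIC processor of this paper; the function names, the iteration count and the floating-point arithmetic are assumptions.

```python
def cordic_linear_multiply(x, z, n_iter=16):
    """Linear rotation mode: drives z to zero and accumulates y ~= x * z.
    Hypothetical helper; converges for |z| < 2."""
    y = 0.0
    for i in range(n_iter):
        d = 1.0 if z >= 0 else -1.0
        y += d * x * 2.0 ** -i   # add or subtract a shifted copy of x
        z -= d * 2.0 ** -i       # consume 2^-i of the multiplier
    return y

def cordic_linear_divide(x, y, n_iter=16):
    """Linear vectoring mode: drives y to zero and accumulates z ~= y / x.
    Hypothetical helper; assumes x > 0 and |y / x| < 2."""
    z = 0.0
    for i in range(n_iter):
        d = 1.0 if y < 0 else -1.0   # choose the direction that moves y toward zero
        y += d * x * 2.0 ** -i
        z -= d * 2.0 ** -i
    return z

print(cordic_linear_multiply(3.0, 0.75))
print(cordic_linear_divide(2.0, 1.5))
```

Both routines use only additions and scalings by powers of two, which is why the same datapath can serve multiplication and division by changing the mode and the initialization.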
(C) Scale-free CORDIC Algorithm
Several improvements to the CORDIC algorithm have been proposed. To improve architecture performance and reduce cost, considerable development has taken place in both algorithm design and architecture. Parallel and pipelined CORDIC architectures are preferred for higher throughput. The pipelined scaling-free CORDIC [10,11,12] is a significant development in this line of research, and the Taylor series-based scale-free CORDIC algorithm is a further important improvement [17]. The rotation matrix for scaling-free CORDIC is given as:
The sine and cosine are approximated as
The conventional CORDIC processor rotates in both directions, whereas the Taylor series-based scale-free CORDIC processor rotates in only one direction. The Taylor series of the sine and cosine terms are defined as
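For reference, the standard Taylor (Maclaurin) expansions of sine and cosine and their third-order truncations are written out below. The symbol \( \alpha \) for the elementary angle follows Sect. 3; the truncated forms are an assumption about the approximation intended here and are not a reproduction of the numbered equations of this paper.

$$ \sin \alpha = \alpha - \frac{\alpha^{3}}{3!} + \frac{\alpha^{5}}{5!} - \cdots \approx \alpha - \frac{\alpha^{3}}{3!} $$

$$ \cos \alpha = 1 - \frac{\alpha^{2}}{2!} + \frac{\alpha^{4}}{4!} - \cdots \approx 1 - \frac{\alpha^{2}}{2!} $$

With \( \alpha = 2^{-s} \), every term is a power of two divided by a small constant, so each term can be produced by shifting.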
(D) Window Filtering Techniques
During spectral analysis, the input signal must be truncated to fit a finite observation window that matches the length of the FFT processor. Direct truncation with a rectangular window causes several effects in the frequency domain, such as the picket fence effect and spectral leakage. Other window functions can be used to reduce these effects. Windowing is a popular way of limiting a signal to short time segments in the desired field. The most common windowing techniques are rectangular, Gaussian, Hamming, Hanning, Blackman-Harris and Kaiser. The choice among the available windows depends on the spectral characteristics required by the application. The equations below define the Hanning, Hamming and Blackman window functions [18]
where N is the window length
where \( \alpha + \beta = 1 \).
The values of α and β are chosen to maximize sidelobe cancellation. For the Hamming window, the coefficients are α = 25/46 and β = 21/46.
where \( \alpha_{0} + \alpha_{1} + \alpha_{2} = 1 \). The Blackman window uses the coefficients \( \alpha_{0} = 0.42,\,\alpha_{1} = 0.5 \) and \( \alpha_{2} = 0.08 \).
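The three windows can be evaluated with a few lines of Python. This sketch uses the commonly quoted generalized-cosine forms with the coefficients stated above; the use of N (rather than N − 1) in the denominator is an assumption chosen to match the angle \( \theta = 2\pi n/N \) used in Sect. 4, and the function names are hypothetical.

```python
import math

def hanning(n, N):
    """Hanning window sample: 0.5 * (1 - cos(2*pi*n/N))."""
    return 0.5 * (1.0 - math.cos(2.0 * math.pi * n / N))

def hamming(n, N, alpha=25.0 / 46.0, beta=21.0 / 46.0):
    """Hamming window sample: alpha - beta * cos(2*pi*n/N), alpha + beta = 1."""
    return alpha - beta * math.cos(2.0 * math.pi * n / N)

def blackman(n, N, a0=0.42, a1=0.5, a2=0.08):
    """Blackman window sample: a0 - a1*cos(theta) + a2*cos(2*theta)."""
    theta = 2.0 * math.pi * n / N
    return a0 - a1 * math.cos(theta) + a2 * math.cos(2.0 * theta)

N = 64
for n in (0, N // 4, N // 2):
    print(n, hanning(n, N), hamming(n, N), blackman(n, N))
```

Each window needs only the cosines of θ and 2θ plus constant multiplications, which is what makes a CORDIC rotator and a coefficient multiplier sufficient to generate them.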
3 Design Aspects of Scale-Free CORDIC Processor
In conventional CORDIC, the angle of rotation is decomposed into micro-rotations as follows: (i) the elementary angles are defined according to \( 2^{-i} \), where i is the iteration number, and a ROM stores the elementary angles; (ii) the micro-rotations corresponding to all the elementary angles are performed clockwise or anticlockwise; and (iii) each elementary angle is used only once. In the scale-free CORDIC processor, by contrast, the micro-rotations are performed in one direction only; the micro-rotation for the initial shift may be repeated several times, while the micro-rotations for the other shifts are not repeated.
(a) To eliminate the ROM used for storing the elementary angles and to simplify the hardware, the elementary angles are defined [19] as \( \alpha_{i} = 2^{-s_{i}} \), where \( s_{i} \) is the number of shifts for the ith iteration.
(b) The most significant one (MSO) location is the bit position of the first 1 in an input bit string, counted from the most significant bit (MSB). The MSO location identifier (MSO-LI) generates an n-bit output for a \( 2^{n} \)-bit input string and is used to find the shift index \( s_{i} = N - M \), where N is the word length of the input data and M is the location of the most significant one in the N-bit input string.
(c) The order of approximation of the Taylor series determines the largest elementary angle. For the third order of approximation, the basic shift and the largest elementary angle are:
$$ s_{b} = \left\lfloor \frac{l - \log_{2}(4!)}{4} \right\rfloor $$ (19)
$$ \alpha_{max} = 2^{-s_{b}} $$ (20)
where l is the word length. For a 16-bit word length, \( s_{b} = \left\lfloor 2.854 \right\rfloor \). Depending on the desired accuracy, one can select either \( s_{b} = 2 \) or \( s_{b} = 3 \). Any rotation angle θ is expressed as:
$$ \theta = n_{1} \cdot \alpha_{max} + n_{2} \cdot \sum \alpha_{s_{i}} $$ (21)
where \( s_{i} \ge s_{b} \) and \( n = n_{1} + n_{2} \) is the total number of iterations, which is a constant. For the third-order Taylor series approximation, the number of iterations is seven.
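Items (b) and (c) can be checked numerically with the short Python sketch below. It assumes 0-indexed bit positions in a 16-bit unsigned fractional angle word, which is consistent with the values ML = 15 and shift = 16 − ML used later in this section; the function names are hypothetical.

```python
import math

def basic_shift(word_length):
    """Basic shift s_b = floor((l - log2(4!)) / 4) for the third-order case."""
    return math.floor((word_length - math.log2(math.factorial(4))) / 4)

def mso_location(value):
    """0-indexed position M of the most significant 1 bit (-1 if value is 0)."""
    return value.bit_length() - 1

def shift_index(value, word_length=16):
    """Shift index s_i = N - M for a nonzero N-bit word."""
    return word_length - mso_location(value)

print(basic_shift(16))                        # basic shift for a 16-bit word
print(mso_location(0x4000), shift_index(0x4000))
```

In hardware the MSO-LI is a priority encoder; `int.bit_length` plays that role here.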
To design the scale-free CORDIC processor and generate the micro-rotation sequence, we take the input angle to be rotated, \( \theta_{i} \), and denote the location of its most significant one bit by ML (the output of the location identifier). If ML = 15, the elementary angle is α = 0.25 radians with shift \( s_{i} = 2 \), and \( \theta_{i+1} = \theta_{i} - \alpha \). If ML is other than 15, the shift is \( s_{i} = 16 - ML \) and \( \theta_{i+1} = \theta_{i} \) with bit \( \theta_{i}[ML] \) set to '0'.
Table 2 shows the elementary angles corresponding to the basic shift values.
The percentage error of the sine and cosine values is negligible for angles in the range \( \left( 0, \pi/4 \right) \), so the maximum angle of rotation handled by the micro-rotation sequence generator lies in the range \( \left( 0, \pi/4 \right) \).
The following points are important for the design of the micro-rotation sequence generator.
(i) If \( N - MSOB_{location} < s_{b} \), the shift index corresponding to the highest elementary angle \( \alpha_{max} = 0.25 \) radians is used, i.e. shift index = 2.
(ii) If \( N - MSOB_{location} \ge s_{b} \), the highest available elementary angle \( \alpha_{s_{i}} \) is used for the CORDIC iteration, corresponding to \( s_{i} = 16 - M \).
The third-order Taylor series expansion of the sine and cosine functions gives the rotation matrix for the proposed architecture. In the complete scale-free CORDIC algorithm, to simplify the rotation matrix equations, the Taylor series coefficient 3! is approximated by \( 2^{3} \) so that the division can be realized as a shift.
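The micro-rotation sequence generation and the shift-only rotation described in this section can be sketched together in Python. This is an illustrative floating-point model under stated assumptions, not the VHDL design: the angle is held as a 16-bit unsigned fraction whose bit M has weight \( 2^{-(16-M)} \), the basic shift is 2, the sine term uses \( 2^{3} \) in place of 3!, and all names are hypothetical.

```python
import math

WORD = 16          # word length of the angle register (assumed)
BASIC_SHIFT = 2    # s_b, so alpha_max = 2^-2 = 0.25 rad

def micro_rotation_shifts(theta, max_iter=7):
    """Return the shifts s_i used to rotate by theta, for theta in [0, pi/4]."""
    reg = int(theta * (1 << WORD))          # angle as an unsigned 16-bit fraction
    shifts = []
    while reg and len(shifts) < max_iter:
        ml = reg.bit_length() - 1           # most significant one location
        if WORD - ml < BASIC_SHIFT:         # ML = 15: repeat the largest angle
            shifts.append(BASIC_SHIFT)
            reg -= 1 << (WORD - BASIC_SHIFT)    # theta <- theta - 0.25
        else:                               # otherwise use s_i = 16 - ML
            shifts.append(WORD - ml)
            reg &= ~(1 << ml)               # clear bit ML of theta
    return shifts

def scale_free_rotate(x, y, theta):
    """Rotate (x, y) anticlockwise by theta using third-order Taylor terms."""
    for s in micro_rotation_shifts(theta):
        c = 1.0 - 2.0 ** -(2 * s + 1)           # cos(a) ~ 1 - a^2 / 2!
        sn = 2.0 ** -s - 2.0 ** -(3 * s + 3)    # sin(a) ~ a - a^3 / 2^3
        x, y = c * x - sn * y, sn * x + c * y   # no scale factor to compensate
    return x, y

print(micro_rotation_shifts(0.6))
print(scale_free_rotate(1.0, 0.0, 0.6), (math.cos(0.6), math.sin(0.6)))
```

Because every coefficient is one minus or plus a power of two, each micro-rotation reduces to shifts and additions, and no elementary-angle ROM or scale factor compensation is needed.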
Figure 1 shows the coordinate calculation unit, which computes the \( x_{i} \) and \( y_{i} \) values. The shift index (\( s_{i} \)) calculation unit is shown in Fig. 2; the shift index depends on the elementary angles. The elementary angle calculation unit, or micro-rotation sequence generator, is shown in Fig. 3.
4 Architecture for Window Functions
We use a pipelined architecture to implement the window functions, designed for a 16-bit output width. Two designs are considered: the first uses a circular CORDIC processor, a linear CORDIC processor and an angle generator circuit; the second uses a circular CORDIC processor and a window coefficient multiplier built from a Booth multiplier. Figure 4 shows the block diagram of the angle generator unit. Figures 5 and 6 show the block diagrams for generating the different window functions. The window function is chosen by the window select pins: Hanning (ws0 = 0, ws1 = 0), Hamming (ws0 = 1, ws1 = 0) and Blackman (ws0 = 0, ws1 = 1) [3, 18]. The circuit combines an angle generator unit (AGU), a window coefficient multiplier (WCM), a circular CORDIC processor (CCP) and a first-in first-out (FIFO) register. The angle generator unit generates two angles, \( \theta = \frac{2\pi n}{N} \) and \( 2\theta = \frac{4\pi n}{N} \). The window coefficients are multiplied using either a linear CORDIC based on the conventional CORDIC algorithm or an optimized shift-add network designed using a Booth multiplier. A CORDIC processor in rotation mode with circular trajectory produces the cosine terms used in the window function equations.
The angle generator unit produces two angles for evaluating the window functions.
where N is a power of 2, such that \( N = 2^{M} \)
The difference between the consecutive values of \( \theta \) is given by
For \( {\text{N}} = 2^{M} \)
Using binomial theorem (BT), we simplify to
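As a rough illustration of the angle generation described above, the following Python sketch produces θ and 2θ for successive samples with one addition per sample. It assumes a fixed-point accumulator with 16 fractional bits and obtains the step \( 2\pi / 2^{M} \) by right-shifting a stored constant; it does not attempt to reproduce the binomial-theorem simplification mentioned in the text, and the function name and parameters are hypothetical.

```python
import math

def angle_generator(M, frac_bits=16):
    """Yield (theta, 2*theta) for n = 0 .. N-1 with N = 2^M.
    Angles are integers scaled by 2^frac_bits (assumed fixed-point format)."""
    two_pi = round(2.0 * math.pi * (1 << frac_bits))  # stored constant 2*pi
    step = two_pi >> M              # difference between consecutive theta values
    theta = 0
    for _ in range(1 << M):
        yield theta, theta << 1     # 2*theta is a one-bit left shift
        theta += step               # accumulate: theta_(n+1) = theta_n + step

scale = float(1 << 16)
for n, (t, t2) in enumerate(angle_generator(3)):
    print(n, t / scale, t2 / scale)
```

Because N is a power of two, the division by N costs nothing beyond wiring, which is the reason for restricting the window length to \( 2^{M} \).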
The CCP unit is designed for the target angle range \( \left[ 0, \pi/4 \right] \). The target angle range is extended using octant symmetry, as shown in Fig. 7; Table 3 shows the initial coordinate values used to extend the target angle range.
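The idea of extending a \( \left[ 0, \pi/4 \right] \) core to the full circle can be sketched as follows. The mapping below is derived from standard trigonometric identities and is only an assumption about how the octants may be handled; it is not a reproduction of Table 3, and `cos_full_range` and `core_cos_sin` are hypothetical names.

```python
import math

def cos_full_range(theta, core_cos_sin):
    """Return cos(theta) for theta in [0, 2*pi) using a core that is only
    valid on [0, pi/4] and returns the pair (cos(phi), sin(phi))."""
    quarter = math.pi / 4
    octant = int(theta // quarter) % 8
    r = theta - octant * quarter                 # remainder within the octant
    # Even octants use the remainder directly, odd octants use its mirror.
    phi = r if octant % 2 == 0 else quarter - r
    c, s = core_cos_sin(phi)
    # Per octant: which core output gives cos(theta), and with which sign.
    table = [(c, 1), (s, 1), (s, -1), (c, -1),
             (c, -1), (s, -1), (s, 1), (c, 1)]
    value, sign = table[octant]
    return sign * value

# Stand-in core for the demonstration; a CORDIC rotator would be used instead.
exact_core = lambda phi: (math.cos(phi), math.sin(phi))
print(cos_full_range(2.5, exact_core), math.cos(2.5))
```

Only a swap of the two core outputs and a sign change are needed per octant, so the range extension adds little hardware to the rotator itself.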
5 FPGA Implementation Results
Scale-free CORDIC algorithm-based window function architectures were designed using (i) linear and circular CORDIC processors and (ii) an add-shift network and a circular CORDIC processor. They were designed as VHDL modules in Xilinx 13.1 and mapped onto the Virtex-5 (XC5VLX20T-FF323) device. Table 4 shows that, for a 16-bit implementation, the first design consumes 1128 slices and 7674 4-input LUTs, with a maximum operating frequency of 70.598 MHz. The total delay is 50.786 ns, of which the logic delay is 8.850 ns and the route delay is 41.936 ns. The second design consumes 1098 slices and 8119 4-input LUTs, with a maximum operating frequency of 70.961 MHz. The total delay is 2.862 ns, of which 2.54 ns is logic delay and 0.286 ns is route delay.
5.1 Area
In the design with linear and circular CORDIC processors, seven 16-bit adder/subtractors and 261 registers were used. The numbers of latches, comparators and multiplexers used are 94, 803 and 150, respectively. The number of XOR logic gates used is 12177.
Device usage summary: Selected Device: 5vlx20tff323-2
Slice Logic Utilization: (a) Number of Slice Registers: 2137 out of 12480 (17%) (b) Number of Slice LUTs: 7703 out of 12480 (61%) (c) Number used as Logic: 7703 out of 12480 (61%).
In the design with the add-shift network and circular CORDIC processor, 106 16-bit adder/subtractors and 98 32-bit adder/subtractors were used. The numbers of registers and latches used are 1232 and 142, respectively. The number of comparators is 803, and the number of XOR logic gates is 4752.
Device usage summary: Selected Device: 5vlx20tff323-2. Slice Logic Utilization: (a) Number of Slice Registers: 3146 out of 12480 (25%) (b) Number of Slice LUTs: 8011 out of 12480 (64%) (c) Number used as Logic: 8011 out of 12480 (64%).
5.2 Latency and Delay
The throughput of both architectures is equal: one data sample per clock cycle. Latency is the number of iterations (pipeline stages) of the pipelined CORDIC processors, so it differs between the two architectures. The first design contains two circular and three linear CORDIC processors, giving 26 pipeline stages in total, while the second design has only 10 pipeline stages. The second architecture therefore has lower latency than the first, and its total delay is also lower.
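To put the stage counts in time units, the small Python calculation below converts pipeline stages to nanoseconds. It assumes that each pipeline stage takes exactly one clock cycle and that each design runs at its reported maximum operating frequency; both are assumptions for illustration, and the helper name is hypothetical.

```python
def pipeline_latency_ns(stages, fmax_mhz):
    """Latency in nanoseconds, assuming one pipeline stage per clock at fmax."""
    return stages * 1000.0 / fmax_mhz

first = pipeline_latency_ns(26, 70.598)    # linear + circular CORDIC design
second = pipeline_latency_ns(10, 70.961)   # add-shift network + circular CORDIC
print(first, second, first / second)
```

Under these assumptions the first result reaches the output after roughly 368 ns in the first design and roughly 141 ns in the second, while both designs still deliver one sample per clock once the pipeline is full.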
6 Conclusion
In this paper, we performed a comparative study of two window function generators: one designed using a circular CORDIC processor and a linear CORDIC processor, and the other designed using a circular CORDIC processor and an add-shift network built from a Booth multiplier. We observe that the total delay is comparatively small in the proposed architecture, i.e. the design with the CORDIC processor and add-shift network. This is because the add-shift network reduces the number of pipeline stages, which in turn reduces the number of iterations and the latency. Further, we observe that all the multiplication operations can be performed directly by the add-shift network with the Booth multiplier.
References
1. Parhi, K. K. (1999). VLSI digital signal processing systems. Wiley.
2. Ray, K. C., & Dhar, A. S. (2006). CORDIC-based unified VLSI architecture for implementing window functions for real time spectral analysis. IEEE Proceedings: Circuits, Devices and Systems, 153(6), 539–544.
3. Ray, K. C., & Dhar, A. S. (2008). High throughput VLSI architecture for Blackman windowing in real spectral analysis. Journal of Computers, 3(5), 54–59.
4. Vaidyanathan, P. P. (1985). A unified approach to orthogonal digital filters and wave digital filters based on the LBR two-pair extraction. IEEE Transactions on Circuits and Systems I, CAS-32, 673–686.
5. Banerjee, A., Dhar, A. S., & Banerjee, S. (2001). FPGA realization of a CORDIC based FFT processor for biomedical signal processing. Microprocessors and Microsystems, 25(3), 131–142.
6. Gisuthan, B., & Srikanthan, T. (2000). FLAT CORDIC: A unified architecture for high speed generation of trigonometric and hyperbolic functions. In Proceedings of the 43rd IEEE Midwest Symposium on Circuits and Systems, Lansing MI (pp. 1414–1417).
7. Juang, T. B., Hsiao, S. F., & Tsai, M. Y. (2004). Para-CORDIC: Parallel CORDIC rotation algorithm. IEEE Transactions on Circuits and Systems I, 51(8), 1515–1524.
8. Lin, C. H., & Wu, A. Y. (2005). Mixed-scaling-rotation-CORDIC (MSR-CORDIC) algorithm and architecture for high-performance vector rotational DSP applications. IEEE Transactions on Circuits and Systems I, 52(11), 2385–2396.
9. Sumanasen, M. G. B. (2008). A scale factor correction scheme for the CORDIC algorithm. IEEE Transactions on Computers, 57(8), 1148–1152.
10. Maharatna, K., Troya, A., Banerjee, S., & Grass, E. (2004). Virtually scaling free adaptive CORDIC rotator. IEEE Proceedings Computers and Digital Techniques, 151(6), 448–456.
11. Maharatna, K., & Banerjee, S. (2005). Modified virtually scaling free adaptive CORDIC rotator algorithm and architecture. IEEE Transactions on Circuits and Systems for Video Technology, 15(11), 1463–1474.
12. Jaime, F. J., Sanchez, M. A., Hormigo, J., Villalba, J., & Zapata, E. L. (2010). Enhanced scaling free CORDIC. IEEE Transactions on Circuits and Systems Video Technology, 57(7), 1654–1662.
13. Volder, J. E. (1959). The CORDIC trigonometric computing technique. IRE Transactions on Electronic Computers, 8(3), 330–334.
14. Volder, J. E. (2000). The birth of CORDIC. Journal of VLSI Signal Processing, 25(2), 101–105.
15. Walther, J. S. (1971). A unified algorithm for elementary functions. In Proceedings of AFIPS Spring Joint Computer Conference (pp. 379–385).
16. Walther, J. S. (2000). The story of unified CORDIC. Journal of VLSI Signal Processing, 25(2), 107–112.
17. Meher, P. K., Valls, J., Juang, T. B., Sridhara, K., & Maharatna, K. (2009). 50 years of CORDIC algorithms, architectures and applications. IEEE Transactions on Circuits and Systems I, 56(9), 1893–1907.
18. Aggarwal, S., & Khare, K. (2012). Redesigned-scale-free CORDIC algorithm based FPGA implementation of window functions to minimize area and latency. International Journal of Reconfigurable Computing, 2012(185784), 1–8.
19. Aggarwal, S., Meher, P. K., & Khare, K. (2013). Scale-free hyperbolic CORDIC processor and its application to waveform generation. IEEE Transactions on Circuits and Systems I: Regular Papers, 60(2), 314–326.
© 2019 Springer Nature Singapore Pte Ltd.
Rai, S., Srivastava, R. (2019). FPGA Realization of Scale-Free CORDIC Algorithm-Based Window Functions. In: Khare, A., Tiwary, U., Sethi, I., Singh, N. (eds) Recent Trends in Communication, Computing, and Electronics. Lecture Notes in Electrical Engineering, vol 524. Springer, Singapore. https://doi.org/10.1007/978-981-13-2685-1_24
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-2684-4
Online ISBN: 978-981-13-2685-1
eBook Packages: Engineering, Engineering (R0)