Efficient Hardware Implementation of Discrete Fourier Transformation Algorithm
using Circular Convolution Techniques
This chapter proposes an efficient hardware implementation module for the Discrete Fourier Transformation (DFT) algorithm based on circular convolution techniques. The proposed design exploits the numerical properties of the DFT coefficients. Based on the symmetry properties of the twiddle factor coefficients, the ROM size is reduced in comparison to the conventional design. The IEEE 754 single-precision format is used for the twiddle factors. The propagation delay and power consumption overhead of the conventional floating-point multiplier and adder are avoided by using an Egyptian multiplier. Finally, a fully functional transistor-level implementation of the 16-point DFT processor is presented. The functionality of these circuits has been verified, and their switching power and propagation delay have been evaluated in a 90 nm standard CMOS technology using the Spectre SPICE simulator.
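The ROM reduction from twiddle factor symmetry can be illustrated with a small software sketch. The text above does not state which symmetries the design uses, so the quarter-wave scheme below is an assumption chosen for illustration: only cos(2*pi*k/N) for k = 0..N/4 is stored, rounded to IEEE 754 single precision, and every twiddle factor W_N^k is rebuilt from that table. The names `QUARTER_ROM`, `cos_lookup`, and `twiddle` are hypothetical.

```python
import math
import struct

N = 16  # transform length used for the 16-point DFT processor


def to_float32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Hypothetical reduced ROM: N/4 + 1 cosine samples instead of N complex entries.
QUARTER_ROM = [to_float32(math.cos(2 * math.pi * k / N)) for k in range(N // 4 + 1)]


def cos_lookup(k):
    """Return cos(2*pi*k/N) using only the quarter-wave table."""
    k %= N
    if k > N // 2:
        k = N - k                        # cos(2*pi - t) = cos(t)
    if k > N // 4:
        return -QUARTER_ROM[N // 2 - k]  # cos(pi - t) = -cos(t)
    return QUARTER_ROM[k]


def twiddle(k):
    """Return W_N^k = cos(2*pi*k/N) - j*sin(2*pi*k/N)."""
    # sin(t) = cos(t - pi/2), which is a shift of N/4 table positions.
    return complex(cos_lookup(k), -cos_lookup(k - N // 4))
```

Under this assumption the table holds 5 real words for N = 16 rather than 16 complex entries; a real design might choose a different symmetry or table layout.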
The Discrete Fourier Transformation (DFT) plays a key role in digital signal and image processing, data compression, high-speed broadband communications, and related fields. Because of its high computational complexity, hardware-efficient DFT modules are a necessity. Several algorithms have been proposed to implement the DFT [1-8], but most are not well suited to VLSI implementation. Systolic-array-based architectures [5-7] are VLSI oriented because of their modularity and regularity, but their area consumption is large. In existing systolic-array-based DFT designs, multipliers are the fundamental computing elements in the processing elements (PEs). Because multipliers consume a large silicon area, the limited chip size severely restricts the allowable number of PEs. To enable constant multiplications, memory-based designs have proposed cyclic-convolution-based architectures. The cyclic-convolution-based design features simple I/O behavior and removes data redundancy in the DFT coefficients.
In this chapter, a hardware-efficient DFT architecture is proposed. The proposed technique reformulates the transform equations into cyclic convolution form so that every DFT output uses the same computation kernel instead of multiple different ones. To implement circular convolution of two N-point sequences in hardware, we propose a MAC-based architecture that uses a systolic array to generate the convolution sum required by the standard algorithm. The proposed architecture does not impose any limits on the method used to calculate the convolution sum, as long as the chosen method does not introduce round-off errors. For multiplication and addition, we use Egyptian techniques. An Egyptian multiplication is simply replaced by the addition of N Egyptian partial products.
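The idea of recasting a DFT as a cyclic convolution computed by multiply-accumulate (MAC) steps can be sketched in software. The reformulation used for the 16-point design is not given in this section, so the example below swaps in Rader's well-known reformulation, which applies to prime lengths only, purely as an illustration. The function names and the choice N = 7 with primitive root g = 3 are assumptions.

```python
import cmath


def circular_convolve_mac(a, b):
    """Cyclic convolution of two equal-length sequences, one MAC per term."""
    n = len(a)
    out = []
    for p in range(n):
        acc = 0
        for q in range(n):
            acc += a[q] * b[(p - q) % n]  # multiply-accumulate step
        out.append(acc)
    return out


def rader_dft(x, g):
    """DFT of prime length N via one cyclic convolution of length N - 1.

    g must be a primitive root modulo N (assumed, not checked here).
    """
    N = len(x)
    L = N - 1
    g_inv = pow(g, N - 2, N)  # modular inverse of g by Fermat's little theorem
    # Input samples permuted by powers of g.
    a = [x[pow(g, q, N)] for q in range(L)]
    # Fixed kernel: twiddle factors permuted by powers of g^-1 (ROM contents).
    b = [cmath.exp(-2j * cmath.pi * pow(g_inv, q, N) / N) for q in range(L)]
    conv = circular_convolve_mac(a, b)
    X = [0j] * N
    X[0] = sum(x)
    for p in range(L):
        X[pow(g_inv, p, N)] = x[0] + conv[p]
    return X


def direct_dft(x):
    """Reference DFT straight from the definition."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]
```

For example, `rader_dft([1, 2, 3, 4, 5, 6, 7], 3)` should agree with `direct_dft` on the same input up to floating-point rounding. Every nonzero output index is produced by the same convolution kernel `b`, which is the property the cyclic convolution form is meant to provide.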
First, compared with existing methods such as the direct method or the strength reduction technique, the proposed approach results not only in simplified arithmetic operations but also in a regular, array-like structure. Second, the redundancy used in the Egyptian representation allows the partial products to be added in parallel, without carry propagation from the LSB through to the MSB.
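The notion of a product formed by adding partial products can be illustrated with the classical Egyptian (doubling-and-adding) multiplication on non-negative integers. This is only a sketch of the textbook algorithm: it does not model the redundant representation, the carry-free addition, or the floating-point datapath described above, and the function names are hypothetical.

```python
def egyptian_partial_products(multiplicand, multiplier):
    """Return the doubled copies of multiplicand selected by the multiplier bits."""
    if multiplier < 0:
        raise ValueError("multiplier must be non-negative in this sketch")
    partials = []
    doubled = multiplicand
    while multiplier:
        if multiplier & 1:          # this power of two is part of the multiplier
            partials.append(doubled)
        doubled += doubled          # doubling by addition (a left shift in hardware)
        multiplier >>= 1
    return partials


def egyptian_multiply(multiplicand, multiplier):
    """Multiply by summing the selected partial products; no multiplier is used."""
    return sum(egyptian_partial_products(multiplicand, multiplier))
```

For 13 x 11, the multiplier 11 = 1 + 2 + 8 selects the partial products 13, 26, and 104, whose sum is 143. Because the partial products are independent of one another, a hardware realization can add them in a tree rather than sequentially, as this software loop does.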
The proposed DFT architecture is fully parameterized, so it can be extended to an N-point DFT. The functionality of the proposed method has been implemented and verified using the Spectre SPICE simulator. The performance of a transistor-level implementation in an existing 90 nm standard CMOS technology has been evaluated with the same simulator and compared with other designs. These comparisons show that the proposed design has a lower hardware cost and that its propagation delay is around 50% lower than that of Guo's architecture.