In this chapter, new hardware designs are proposed for computing circular convolutions. VLSI convolution chips have been designed to accommodate various window sizes. The number of coefficients handled is directly proportional to the number of systolic arrays. The computation in these convolution architectures is based on the one-dimensional systolic array, and the array elements are computed using multiply-accumulate (MAC) units. An architecture for parallel processing with generalized one-dimensional systolic arrays for linear and circular convolution is summarized, which gives a substantially better response in terms of power and speed compared to the conventional design. The circuit is optimized for speed and power consumption by using CMOS logic and a hybrid CMOS-TG logic. The circuits are designed, and their functionality is verified, using Verilog.

The convolution operation is at the heart of digital signal processing and digital image processing. However, computing the convolution sum requires a number of steps that are tedious and slow to perform. The usual approach to determining the convolution sum is the graphical method, because it gives visual insight into the convolution mechanism. Graphical convolution is very systematic to compute but is also very tedious and time-consuming. The sliding tape method is basically the same as the graphical approach but is a little less tedious to perform. Often, convolution is computed analytically, and tables of common convolution sums are available. Other approaches, such as an array form of the graphical method or specially structured tables of values of the sequences, may be used to compute the discrete convolution sum. These methods are not as conceptual as the graphical or sliding tape method but may make the computation less tedious. A convenient way to compute the linear convolution of two N-point sequences is to employ circular convolution, i.e.
using properties of the Discrete Fourier Transform (DFT) or the number theoretic transform (NTT). All of these methods require a (2N-1)-point circular convolution to compute the linear convolution of two N-point discrete-time sequences. To improve the speed of this operation, alternative methods such as the right-angle circular convolution (RCC) have been proposed, and a relationship between this method and linear convolution has been established. This makes it possible to compute the linear convolution of two discrete-time sequences using an N-point RCC instead of the standard (2N-1)-point circular convolution. To compute the RCC, the modified Fermat number transform (FNT) was used, whereby some adjustments had to be made to split the N numbers from the computation of the RCC into the 2N-1 numbers required by linear convolution calculation methods.

To implement the hardware architecture for linear and circular convolution of two N-point sequences, we propose a MAC-based architecture that uses a systolic array to generate the convolution sum as required by the standard algorithm. The proposed architecture does not impose any limits on the method used to calculate the convolution sum, as long as the chosen method does not introduce round-off errors.
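
The relationship between the two operations described above can be illustrated with a small software model. The following Python sketch is not the proposed hardware design; it is a minimal illustration, assuming integer inputs so that no round-off error arises, of how an N-point circular convolution is built from multiply-accumulate steps and how zero-padding both sequences to 2N-1 points turns it into a linear convolution. The function names `circular_convolution` and `linear_convolution_via_circular` are hypothetical and chosen only for this example.

```python
def circular_convolution(x, h):
    """N-point circular convolution using explicit multiply-accumulate steps.

    Each output sample is produced by one accumulator, loosely mirroring
    the role of a single MAC cell (illustrative software model only).
    """
    n = len(x)
    if len(h) != n:
        raise ValueError("both sequences must have the same length")
    y = []
    for k in range(n):
        acc = 0  # accumulator for output sample k
        for m in range(n):
            # One multiply-accumulate step; the index wraps modulo N.
            acc += x[m] * h[(k - m) % n]
        y.append(acc)
    return y


def linear_convolution_via_circular(x, h):
    """Linear convolution of two N-point sequences.

    Both inputs are zero-padded to 2N-1 points, so the wrap-around of the
    circular convolution only ever touches zero-valued samples.
    """
    n = len(x)
    if len(h) != n:
        raise ValueError("both sequences must have the same length")
    size = 2 * n - 1
    x_padded = list(x) + [0] * (size - n)
    h_padded = list(h) + [0] * (size - n)
    return circular_convolution(x_padded, h_padded)


if __name__ == "__main__":
    x = [1, 2, 3]
    h = [4, 5, 6]
    print(circular_convolution(x, h))             # 3-point circular result
    print(linear_convolution_via_circular(x, h))  # 5-point linear result
```

For the sequences x = [1, 2, 3] and h = [4, 5, 6], the 3-point circular convolution is [31, 31, 28], while the 5-point padded computation yields the linear convolution [4, 13, 28, 27, 18]. The wrapped terms of the circular result are the tail of the linear result folded back onto its head (4 + 27 and 13 + 18), which is why 2N-1 points are needed to avoid aliasing.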