A general radix-2 FFT algorithm was recently developed and implemented for modern Single Instruction Multiple Data (SIMD) architectures. This algorithm (SIMD-FFT) was found to be faster than any scalar FFT implementation, and also faster than other FFT implementations that use the SIMD architecture, for complex 1D and 2D input data [1]. In this paper, the SIMD-FFT algorithm is extended to handle multi-dimensional input data; the new approach does not use matrix transposition. The results are compared against FFTW for the 2D and 3D cases. Overall, the SIMD-FFT was found to be faster for complex 2D input data (by 82% to 343%) and for complex 3D input data (by 59.5% to 198%).
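The transposition-free idea can be illustrated with a minimal Python sketch. This is not the SIMD-FFT algorithm itself: it uses no SIMD instructions, and the paper's internal design is not described in this abstract. The sketch only assumes a scalar radix-2 transform applied along each dimension of a row-major array through a stride, so that columns are transformed in place rather than transposed into rows first. The names `fft_strided` and `fft2d_no_transpose` are hypothetical.

```python
import cmath


def fft_strided(data, start, stride, n):
    """In-place iterative radix-2 FFT of the n elements data[start + k*stride].

    Assumption for this sketch: n is a power of two.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")

    # Bit-reversal permutation, addressed through the stride.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a, b = start + i * stride, start + j * stride
            data[a], data[b] = data[b], data[a]

    # Butterfly stages: sub-transform sizes 2, 4, ..., n.
    size = 2
    while size <= n:
        half = size // 2
        w_step = cmath.exp(-2j * cmath.pi / size)
        for block in range(0, n, size):
            w = 1 + 0j
            for k in range(half):
                a = start + (block + k) * stride
                b = start + (block + k + half) * stride
                t = w * data[b]
                data[b] = data[a] - t
                data[a] = data[a] + t
                w *= w_step
        size *= 2


def fft2d_no_transpose(data, rows, cols):
    """2D FFT of a flat row-major list, with no transposition step."""
    # Rows are contiguous: stride 1.
    for r in range(rows):
        fft_strided(data, r * cols, 1, cols)
    # Columns are reached by stepping a full row length each time.
    for c in range(cols):
        fft_strided(data, c, cols, rows)
    return data


if __name__ == "__main__":
    # A unit impulse at the origin transforms to a constant array of ones.
    grid = [0j] * (4 * 8)
    grid[0] = 1 + 0j
    print(fft2d_no_transpose(grid, 4, 8)[:4])
```

A 3D transform extends the same pattern with a third pass whose stride is `rows * cols`. Strided column access is less cache-friendly than contiguous access, which is one reason transpose-based schemes exist; how SIMD-FFT handles that trade-off is not stated in the abstract.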
Abstract—We present novel algorithms for computing discrete Fourier transforms with high performance...
Complex two-dimensional FFTs up to size 256 × 256 points were implemented on the Intel iPSC/System...
A new parallel pipelined feed forward architecture for real-time signal is proposed. A hardware orie...
The Fast Fourier Transform is probably one of the most studied algorithms of all time. New technique...
Conventional two dimensional fast Fouri...
We present a parallel FFT algorithm for SIMD systems following the "Transpose Algorithm" approach. T...
Modern RISC processors provide a special instruction -- the fused multiply-add (FMA) instruction \Si...
Abstract—This letter presents an efficient split vector-radix-2/8 fast Fourier transform (FFT) algor...
We describe an efficient algorithm for calculating Fast Fourier Transforms on matrices of arbitraril...
Abstract. This paper presents compiler technology that targets general purpose microprocessors augme...
A new on-chip implementation of Fast Fourier Transform (FFT) based on Radix 2 is presented. The pipe...
Abstract. It is proposed to enhance and simplify the programming of a two dimensional (2-D) torus (and...
The discrete Fourier transform (DFT) and discrete Hartley transform (DHT) play a crucial role in one...
Abstract-- The Discrete Fourier Transform (DFT) is used to transform the samples in time domain into...