Novel architectures that leverage long, variable-length vectors, such as the NEC SX-Aurora and the RISC-V vector extension, are emerging as promising solutions in the supercomputing market. These architectures often require scientific kernels to be re-coded. For example, traditional implementations of fast Fourier transform (FFT) algorithms cannot take full advantage of vector architectures. In this paper, we present implementations of FFT algorithms that can leverage these novel architectures. We evaluate these codes on the NEC SX-Aurora, comparing them with the optimized NEC libraries. We present the benefits and limitations of two approaches to vectorizing the radix-2 FFT. We show that our approach makes better use of t...
Mathematical software for the Fast Fourier Transform. We present a library for computing the Fast Fo...
We present an MPI-based software library for computing fast Fourier transforms on massively paral...
This letter presents an efficient split vector-radix-2/8 fast Fourier transform (FFT) algorithm. The...
Many traditional algorithms for computing the fast Fourier transform (FFT) on conventional computers...
In this paper we extend a custom FFT vector architecture by adding multiple lane capabilities and st...
This paper presents the fastest fast Fourier transform (FFT) hardware architectures so far. The arch...
The emergence of streaming multicore processors with multi-SIMD (single-instruction multiple-data) a...
The emergence of streaming multicore processors with multi-SIMD architectures and ultra-lo...
The Fast Fourier Transform is probably one of the most studied algorithms of all time. New technique...
The native implementation of the N-point digital Fourier Transform involves calculating the scalar p...
The development of the fast Fourier transform (FFT) and its numerous variants in the past 30...
this paper point to software. Furthermore a simple tutorial on FFTs is presented there without expli...
In recent years, the SC FFT architecture has become popular for processing serial data. It requires ...
We describe an efficient algorithm for calculating Fast Fourier Transforms on matrices of arbitraril...