Title
AutoFFT: a template-based FFT codes auto-generation framework for ARM and X86 CPUs
Abstract
The discrete Fourier transform (DFT) is widely used in scientific and engineering computation. This paper proposes AutoFFT, a template-based code generation framework that automatically generates high-performance fast Fourier transform (FFT) code. AutoFFT uses the Cooley-Tukey FFT algorithm, which exploits the symmetry and periodicity of the DFT matrix, as its outer parallelization framework. To further reduce the number of floating-point operations in the butterflies, we exploit additional symmetry and periodicity properties of the DFT matrix and formulate two optimized calculation templates for prime and power-of-two radices. To fully exploit hardware resources, we encapsulate a series of optimizations in an assembly template optimizer. Given any DFT problem, AutoFFT automatically generates C FFT kernels from these two templates and translates them into efficient assembly code using the template optimizer. Experiments show that AutoFFT outperforms FFTW, ARMPL, and Intel MKL on average across all FFT types on ARMv8 and Intel x86-64 processors.
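For readers unfamiliar with the Cooley-Tukey decomposition mentioned in the abstract, the following is a minimal textbook sketch of the radix-2 case in Python. It is not AutoFFT's generated code or either of its templates; the function names `naive_dft` and `fft_radix2` are hypothetical and chosen only for illustration. It shows how the periodicity and symmetry of the DFT twiddle factors let one butterfly produce two outputs from the same pair of sub-transform values.

```python
import cmath


def naive_dft(x):
    """Direct O(n^2) evaluation of the DFT definition, used as a reference."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time Cooley-Tukey FFT.

    Illustrative only: the input length must be a power of two.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]

    # Split into even- and odd-indexed samples and transform each half.
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])

    half = n // 2
    out = [0j] * n
    for k in range(half):
        # Twiddle factor w^k with w = exp(-2*pi*i/n).
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        # Periodicity: the half-size transforms repeat with period n/2,
        # so even[k] and odd[k] serve both outputs k and k + n/2.
        out[k] = even[k] + t
        # Symmetry: w^(k + n/2) = -w^k, so the second output only flips a sign.
        out[k + half] = even[k] - t
    return out
```

A quick sanity check is to compare `fft_radix2(x)` against `naive_dft(x)` for a short power-of-two input; the two should agree up to floating-point rounding.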
Year
2019
DOI
10.1145/3295500.3356138
Venue
Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
Keywords
AutoFFT, DFT, FFT, code generation, template
Field
x86, Computer science, Parallel computing, Code generation, Fast Fourier transform, Discrete Fourier transform, Template, Periodic graph (geometry), Computation, DFT matrix
DocType
Conference
ISBN
978-1-4503-6229-0
Citations
2
PageRank
0.39
References
0
Authors
7
Name, Order, Citations, PageRank
Zhihao Li, 1, 17, 5.10
Haipeng Jia, 2, 22, 2.20
Yunquan Zhang, 3, 327, 43.92
Tun Chen, 4, 2, 0.39
Liang Yuan, 5, 45, 12.85
Luning Cao, 6, 2, 0.39
Xiao Wang, 7, 92, 9.26