Abstract: This paper proposes an improved Winograd-algorithm method to address a shortcoming of existing long-sequence Fast Fourier Transform (FFT) implementations on the TS201 processor, which do not take full account of the effect of Cache misses on efficiency. The new method makes maximum use of the Cache’s read and write advantages by optimizing the row and column access pattern so that three explicit matrix transpositions are avoided, and it hides the twiddle-factor multiplication by restructuring the butterfly computation. Test results show that the Cache-optimized FFT implementation performs significantly better, and that it can be used for fast acquisition of pulse compression in radar systems.
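As an illustration only, the general idea of a row-column FFT that avoids explicit transposes can be sketched in plain Python. This is the generic decomposition of a length N = n1 * n2 transform, not the paper's TS201 implementation: the names `dft` and `row_column_fft` are hypothetical, the short transforms use a naive DFT for clarity, Cache behaviour is not modelled, and the twiddle factors are applied in a separate pass here rather than folded into the butterflies as the abstract describes.

```python
import cmath


def dft(seq):
    """Naive O(n^2) DFT; serves as the short-transform building block and as a reference."""
    n = len(seq)
    return [
        sum(seq[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def row_column_fft(x, n1, n2):
    """Length n1*n2 DFT built from short DFTs, using index arithmetic instead of transposes.

    Hypothetical helper for illustration; assumes len(x) == n1 * n2.
    """
    n = n1 * n2
    assert len(x) == n

    # Step 1: n1 DFTs of length n2, each read with stride n1 (x[r], x[r + n1], ...).
    rows = [dft(x[r::n1]) for r in range(n1)]

    # Step 2: multiply element (r, c) by the twiddle factor exp(-2*pi*i*r*c / n).
    for r in range(n1):
        for c in range(n2):
            rows[r][c] *= cmath.exp(-2j * cmath.pi * r * c / n)

    # Step 3: n2 DFTs of length n1 down the columns; each result is written
    # directly to its final position n2 * k1 + c, so no transpose pass is needed.
    out = [0j] * n
    for c in range(n2):
        col = dft([rows[r][c] for r in range(n1)])
        for k1 in range(n1):
            out[n2 * k1 + c] = col[k1]
    return out
```

For example, `row_column_fft(x, 3, 4)` on a 12-point sequence `x` should agree with `dft(x)` up to floating-point rounding. The strided reads in step 1 and the scattered writes in step 3 take the place of the explicit transposes; which of those access patterns is Cache-friendly on a given processor depends on its memory layout and is outside the scope of this sketch.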
马潇, 高立宁, 刘腾飞, 金烨. 基于Cache优化的大点数FFT在TS201上的实现[J]. 电子与信息学报, 2013, 35(7): 1774-1778.
Ma Xiao, Gao Li-Ning, Liu Teng-Fei, Jin Ye. Cache-optimized Implementation of Long Sequences FFT on TS201. Journal of Electronics & Information Technology, 2013, 35(7): 1774-1778.