Acceleration Performance Study of Convolutional Neural Network Based on Split-radix-2/(2^a) FFT Algorithms
WU Jiasong①②④ DA Zhen①④ WEI Liming①④ SENHADJI Lotfi②③④ SHU Huazhong①②④
①(The Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing 210096, China)
②(Institut National de la Santé et de la Recherche Médicale U 1099, Rennes 35000, France)
③(Laboratoire Traitement du Signal et de l’Image, Université de Rennes 1, Rennes 35000, France)
④(Centre de Recherche en Information Biomédicale Sino-français, Nanjing 210096, China)
Abstract Convolutional Neural Networks (CNNs) have recently made breakthrough progress in many areas, such as speech recognition and image recognition. Until recently, a limiting factor for the use of CNNs in large-scale applications has been their computational expense, especially the calculation of linear convolution in the spatial domain. The convolution theorem provides a very effective way to implement a spatial-domain linear convolution as a pointwise multiplication in the frequency domain. This paper proposes a unified one-dimensional FFT algorithm based on decimation-in-time split-radix-2/(2^a), in which a is an arbitrary natural number. The acceleration performance of the convolutional neural network is studied using the proposed FFT algorithm in a CPU environment. Experimental results on the MNIST and Cifar-10 databases show great improvement over the direct linear-convolution-based CNN with no loss in accuracy, and the radix-2/4 FFT achieves the best time savings of 38.56% and 72.01%, respectively. Performing the linear convolution operation in the frequency domain is therefore a very effective approach.
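The convolution theorem mentioned in the abstract can be illustrated with a generic NumPy sketch (this uses NumPy's built-in FFT, not the paper's split-radix-2/(2^a) algorithm; the function name `fft_linear_conv` is invented here for illustration): a linear convolution of length len(x)+len(h)-1 is obtained by zero-padding both sequences, multiplying their spectra pointwise, and inverse-transforming.

```python
import numpy as np

def fft_linear_conv(x, h):
    # A linear convolution of x and h has length len(x) + len(h) - 1,
    # so zero-pad both sequences to at least that length. Rounding up
    # to the next power of two keeps radix-2-family FFTs applicable.
    n = len(x) + len(h) - 1
    nfft = 1 << (n - 1).bit_length()
    X = np.fft.fft(x, nfft)   # fft(x, nfft) zero-pads x to length nfft
    H = np.fft.fft(h, nfft)
    # Pointwise multiplication in the frequency domain, then inverse FFT;
    # keep only the first n samples (the rest are padding).
    return np.fft.ifft(X * H).real[:n]

x = np.array([1.0, 2.0, 3.0, 4.0])
h = np.array([1.0, -1.0, 0.5])
print(np.allclose(fft_linear_conv(x, h), np.convolve(x, h)))  # True
```

For the small kernel sizes typical of CNN layers the direct method can still win; the frequency-domain route pays off as signal and kernel lengths grow, which is the regime the paper's timing results address.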
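The proposed algorithm belongs to the decimation-in-time FFT family. As a simpler stand-in (a plain radix-2 DIT FFT, not the paper's split-radix-2/(2^a) variant; `dit_fft` is a name invented for this sketch), the decimation-in-time idea can be shown as a recursive split into even- and odd-indexed samples:

```python
import numpy as np

def dit_fft(x):
    """Recursive decimation-in-time radix-2 FFT for power-of-two lengths."""
    x = np.asarray(x, dtype=complex)
    n = len(x)
    if n == 1:
        return x
    # Decimation in time: transform the even- and odd-indexed subsequences,
    # then combine them with the twiddle factors exp(-2j*pi*k/n).
    even = dit_fft(x[0::2])
    odd = dit_fft(x[1::2])
    twiddle = np.exp(-2j * np.pi * np.arange(n // 2) / n)
    return np.concatenate([even + twiddle * odd, even - twiddle * odd])

x = np.arange(8.0)
print(np.allclose(dit_fft(x), np.fft.fft(x)))  # True
```

Split-radix variants such as radix-2/4 reduce the multiplication count further by splitting the odd-indexed part again into two quarter-length transforms, which is where the time savings reported in the abstract come from.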
Received: 12 April 2016
Published: 29 December 2016
Fund: The National Natural Science Foundation of China (61201344, 61271312, 61401085), the Specialized Research Fund for the Doctoral Program of Higher Education (20120092120036)
Corresponding Authors:
WU Jiasong
E-mail: jswu@seu.edu.cn
[1] HINTON G E and SALAKHUTDINOV R R. Reducing the dimensionality of data with neural networks[J]. Science, 2006, 313: 504-507. doi: 10.1126/science.1127647.

[2] HINTON G E, OSINDERO S, and TEH Y W. A fast learning algorithm for deep belief nets[J]. Neural Computation, 2006, 18(7): 1527-1554. doi: 10.1162/neco.2006.18.7.1527.

[3] BENGIO Y, COURVILLE A, and VINCENT P. Representation learning: A review and new perspectives[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(8): 1798-1828. doi: 10.1109/TPAMI.2013.50.

[4] LECUN Y, BENGIO Y, and HINTON G E. Deep learning[J]. Nature, 2015, 521(7553): 436-444. doi: 10.1038/nature14539.

[5] DENG L and YU D. Deep learning: Methods and applications[J]. Foundations and Trends in Signal Processing, 2014, 7(3): 197-387.

[6] LECUN Y, BOTTOU L, BENGIO Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324. doi: 10.1109/5.726791.

[7] KRIZHEVSKY A, SUTSKEVER I, and HINTON G E. ImageNet classification with deep convolutional neural networks[C]. Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA, 2012: 1097-1105.

[8] SZEGEDY C, LIU W, JIA Y Q, et al. Going deeper with convolutions[C]. IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 2015: 1-9.

[9] JADERBERG M, VEDALDI A, and ZISSERMAN A. Speeding up convolutional neural networks with low rank expansions[C]. British Machine Vision Conference, Nottingham, UK, 2014.

[10] LIU B, WANG M, FOROOSH H, et al. Sparse convolutional neural networks[C]. IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 2015: 806-814.

[11] VASILACHE N, JOHNSON J, MATHIEU M, et al. Fast convolutional nets with fbfft: A GPU performance evaluation[C]. International Conference on Learning Representations, San Diego, CA, USA, 2015: 1-17.

[12] COOLEY J W and TUKEY J W. An algorithm for the machine calculation of complex Fourier series[J]. Mathematics of Computation, 1965, 19(90): 297-301. doi: 10.2307/2003354.

[13] DUHAMEL P and VETTERLI M. Fast Fourier transforms: A tutorial review and a state of the art[J]. Signal Processing, 1990, 19(4): 259-299. doi: 10.1016/0165-1684(90)90158-U.

[14] WINOGRAD S. On computing the discrete Fourier transform[J]. Proceedings of the National Academy of Sciences of the United States of America, 1976, 73(4): 1005-1006. doi: 10.1073/pnas.73.4.1005.

[15] KOLBA D P and PARKS T W. A prime factor FFT algorithm using high-speed convolution[J]. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1977, 25(4): 281-294. doi: 10.1109/TASSP.1977.1162973.

[16] DUHAMEL P and HOLLMANN H. Implementation of split-radix FFT algorithms for complex, real, and real symmetric data[C]. IEEE International Conference on Acoustics, Speech, and Signal Processing, Tampa, FL, USA, 1985: 285-295.

[17] BOUGUEZEL S, AHMAD M O, and SWAMY M N S. A general class of split-radix FFT algorithms for the computation of the DFT of length-2^m[J]. IEEE Transactions on Signal Processing, 2007, 55(8): 4127-4138. doi: 10.1109/TSP.2007.896110.

[18] BI G, LI G, and LI X. A unified expression for split-radix DFT algorithms[C]. IEEE International Conference on Communications, Circuits and Systems, Chengdu, China, 2010: 323-326.

[19] FARHANG-BOROUJENY B and LIM Y C. A comment on the computational complexity of sliding FFT[J]. IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, 1992, 39(12): 875-876. doi: 10.1109/82.208583.

[20] PARK C S and KO S J. The hopping discrete Fourier transform[J]. IEEE Signal Processing Magazine, 2014, 31(2): 135-139. doi: 10.1109/MSP.2013.2292891.

[21] GOUK H G and BLAKE A M. Fast sliding window classification with convolutional neural networks[C]. Proceedings of the 29th International Conference on Image and Vision Computing New Zealand, 2014: 114-118.

[22] RAO K R, KIM D N, and HWANG J J. Fast Fourier Transform: Algorithms and Applications[M]. Berlin: Springer Science & Business Media, 2011: 5-6.

[23] FRIGO M and JOHNSON S G. FFTW: An adaptive software architecture for the FFT[C]. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Seattle, WA, USA, 1998: 1381-1384.

[24] TAKAHASHI D. FFTE: A Fast Fourier Transform package[OL]. http://www.ffte.jp/, 2014.2.

[25] SERMANET P, EIGEN D, ZHANG X, et al. OverFeat: Integrated recognition, localization and detection using convolutional networks[OL]. arXiv:1312.6229, 2013: 1-16.

[26] ARNDT J. Fxtbook[OL]. http://www.jjj.de/fxt/#fxtbook, 2015.1.

[27] LECUN Y, CORTES C, and BURGES C J C. The MNIST database of handwritten digits[OL]. http://yann.lecun.com/exdb/mnist/, 2015.5.

[28] KRIZHEVSKY A, NAIR V, and HINTON G. Cifar-10 dataset[OL]. http://www.cs.toronto.edu/~kriz/cifar.html, 2009.1.