An Improved Scheme of Visual Words Description and Action Recognition Using Local Enhanced Distribution Information
ZHANG Liang, LU Mengmeng, JIANG Hua
(Key Laboratory of Advanced Signal and Image Processing, Civil Aviation University of China, Tianjin 300300, China)

Abstract The traditional Bag-Of-Words (BOW) model easily confuses different action classes because it lacks information about how features are distributed, and the size of the BOW model strongly affects the recognition rate. To capture the distribution of interest points, the positional relationship among interest points within a local spatio-temporal region is computed and used as a distribution-consistency feature. This feature is fused with appearance features to build an enhanced BOW model, and an SVM is adopted for multi-class recognition. Experiments are carried out on the KTH dataset for single-person action recognition and on the UT-Interaction dataset for multi-person abnormal action recognition. Compared with the traditional BOW model, the enhanced BOW algorithm not only greatly improves the recognition rate but also reduces the influence of the BOW model's size on the recognition rate. The experimental results demonstrate the validity and good performance of the proposed algorithm.
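
The idea of fusing appearance-based word counts with local distribution information can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the exact positional-relationship measure, so the sketch assumes a simple stand-in (the mean number of neighbouring interest points within a spatio-temporal radius, grouped by visual word). The function names, the `radius` parameter, and the concatenation-style fusion are all hypothetical, and the SVM stage is left out.

```python
import math


def bow_histogram(words, vocab_size):
    """L1-normalised count of each visual word (the plain BOW descriptor)."""
    hist = [0.0] * vocab_size
    for w in words:
        hist[w] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def local_distribution(points, words, vocab_size, radius):
    """Assumed stand-in feature: for each visual word, the mean number of
    other interest points lying within `radius` of a point of that word."""
    sums = [0.0] * vocab_size
    counts = [0] * vocab_size
    for i, p in enumerate(points):
        # Count neighbours of point i in (x, y, t) space.
        neighbours = sum(
            1 for j, q in enumerate(points)
            if j != i and math.dist(p, q) <= radius
        )
        sums[words[i]] += neighbours
        counts[words[i]] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def enhanced_bow(points, words, vocab_size, radius):
    """Fuse both descriptors by concatenation (an assumed fusion rule)."""
    return (bow_histogram(words, vocab_size)
            + local_distribution(points, words, vocab_size, radius))


# Three interest points (x, y, t) already assigned to visual words 0, 0, 1.
feature = enhanced_bow([(0, 0, 0), (1, 0, 0), (10, 10, 10)], [0, 0, 1], 2, 2.0)
```

The resulting vector has twice the vocabulary size and could be passed to any multi-class classifier, such as the SVM mentioned in the abstract.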

Received: 08 April 2015
Published: 22 January 2016
Fund: The National Natural Science Foundation of China (61179045)
Corresponding Author: ZHANG Liang, E-mail: stonemark@vip.163.com