The traditional Bag-of-Words (BoW) model easily confuses different action classes because it discards the distribution information among features, and the vocabulary size strongly affects the recognition rate. To capture the distribution of interest points, the positional relationships among interest points within a local spatio-temporal region are computed as distribution-consistency features, which are then fused with appearance features to build an enhanced BoW model. A Support Vector Machine (SVM) is adopted for multi-class recognition. Experiments are carried out on the KTH dataset for single-person action recognition and on the UT-Interaction dataset for multi-person abnormal action recognition. Compared with the traditional BoW model, the enhanced BoW algorithm not only improves the recognition rate substantially but also reduces the sensitivity of the recognition rate to the vocabulary size. The experimental results demonstrate the validity and good performance of the proposed algorithm.
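The pipeline summarized above (visual vocabulary construction, per-video word histograms, multi-class SVM) can be sketched as follows. This is a minimal illustration only: the synthetic descriptors, vocabulary size, and classifier settings are assumptions for the sketch, not the paper's actual spatio-temporal interest-point features, distribution-consistency features, or parameters.

```python
# Minimal Bag-of-Words action-recognition sketch.
# Synthetic descriptors stand in for real spatio-temporal
# interest-point features; VOCAB_SIZE and the classifier
# settings are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
VOCAB_SIZE = 16  # assumed codebook size

def make_video(label, n_points=60, dim=8):
    """Synthetic per-video descriptors; each class has a distinct mean."""
    return rng.normal(loc=label, scale=0.3, size=(n_points, dim))

# Two action classes, ten "videos" each.
videos = [make_video(c) for c in (0, 1) for _ in range(10)]
labels = [c for c in (0, 1) for _ in range(10)]

# 1) Build the visual vocabulary by clustering all descriptors.
kmeans = KMeans(n_clusters=VOCAB_SIZE, n_init=5, random_state=0)
kmeans.fit(np.vstack(videos))

# 2) Represent each video as a normalized visual-word histogram.
def bow_histogram(descriptors):
    words = kmeans.predict(descriptors)
    hist = np.bincount(words, minlength=VOCAB_SIZE).astype(float)
    return hist / hist.sum()

X = np.array([bow_histogram(v) for v in videos])

# 3) Train a multi-class SVM on the histograms.
clf = LinearSVC()
clf.fit(X, labels)
print("training accuracy:", clf.score(X, labels))
```

The paper's enhancement would augment step 2 by concatenating distribution features of the interest points within each local spatio-temporal region onto the appearance histogram before classification.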
ZHANG Liang, LU Mengmeng, and JIANG Hua. An improved scheme of visual words description and action recognition using local enhanced distribution information[J]. Journal of Electronics & Information Technology, 2016, 38(3): 549-556.
[1] HU Qiong, QIN Lei, and HUANG Qingming. Human action recognition review based on computer vision[J]. Chinese Journal of Computers, 2013, 36(12): 2512-2524. doi: 10.3724/SP.J.1016.2013.02512.
[2] BEBAR A A and HEMAYED E E. Comparative study for feature detectors in human activity recognition[C]. The 9th IEEE International Computer Engineering Conference (ICENCO), Giza, Egypt, 2013: 19-24. doi: 10.1109/ICENCO.2013.6736470.
[3] LI F and DU J X. Local spatio-temporal interest point detection for human action recognition[C]. The 5th IEEE International Conference on Advanced Computational Intelligence (ICACI), Nanjing, China, 2012: 579-582. doi: 10.1109/ICACI.2012.6463231.
[4] ONOFRI L, SODA P, and IANNELLO G. Multiple subsequence combination in human action recognition[J]. IET Computer Vision, 2014, 8(1): 26-34. doi: 10.1049/iet-cvi.2013.0015.
[5] FOGGIA P, PERCANNELLA G, SAGGESE A, et al. Recognizing human actions by a bag of visual words[C]. IEEE International Conference on Systems, Man, and Cybernetics (SMC), Manchester, UK, 2013: 2910-2915. doi: 10.1109/SMC.2013.496.
[6] ZHANG X, MIAO Z J, and WAN L. Human action categories using motion descriptors[C]. The 19th IEEE International Conference on Image Processing (ICIP), Orlando, FL, USA, 2012: 1381-1384. doi: 10.1109/ICIP.2012.6467126.
[7] LI Y and KUAI Y H. Action recognition based on spatio-temporal interest point[C]. The 5th IEEE International Conference on Biomedical Engineering and Informatics (BMEI), Chongqing, China, 2012: 181-185. doi: 10.1109/BMEI.2012.6512972.
[9]
REN H and MOESLUND T B. Action recognition using salient neighboring histograms[C]. The 20th IEEE International Conference on Image Processing (ICIP), Melbourne, VIC, Australia, 2013: 2807-2811. doi: 10.1109/ICIP.2013.6738578.
[10]
COZAR J R, GONZALEZ-LINARES J M, GUIL N, et al. Visual words selection for human action classification[C]. International Conference on High Performance Computing and Simulation (HPCS), Madrid, Spain, 2012: 188-194. doi: 10.1109/HPCSim.2012.6266910.
[11]
WANG H R, YUAN C F, HU W M, et al. Action recognition using nonnegative action component representation and sparse basis selection[J]. IEEE Transactions on Image Processing, 2014, 23(2): 570-581. doi: 10.1109/TIP.2013.2292550.
[12]
BILINSKI P and BREMOND F. Contextual statistics of space-time ordered features for human action recognition[C]. The 9th IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS), Beijing, China, 2012: 228-233. doi: 10.1109/AVSS.2012.29.
[13]
ZHANG L, ZHEN X T, and SHAO L. High order co-occurrence of visual words for action recognition[C]. The 19th IEEE International Conference on Image Processing (ICIP), Orlando, FL, USA, 2012: 757-760. doi: 10.1109/ICIP.2012.6466970.
[14]
SHAN Y H, ZHANG Z, ZHANG J, et al. Interest point selection with spatio-temporal context for realistic action recognition[C]. The 9th IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS), Beijing, China, 2012: 94-99. doi: 10.1109/AVSS.2012.43.
[15]
TIAN Y and RUAN Q Q. Weight and context method for action recognition using histogram intersection[C]. The 5th IET International Conference on Wireless, Mobile and Multimedia Networks, Beijing, China, 2013: 229-233. doi: 10.1049/cp.2013.2414.
[16]
LAPTEV I and LINDEBERG T. Space-time interest points[C]. The 9th IEEE International Conference on Computer Vision (ICCV), Nice, France, 2003: 432-439. doi: 10.1109/ICCV.2003.1238378.
[17]
KLASER A, MARSZALEK M, and SCHMID C. A spatio-temporal descriptor based on 3D-gradients[C]. The 19th British Machine Vision Conference (BMVC), Leeds, UK, 2008: 1-10.