Facial Expression Recognition Based on the Fusion of Spatio-temporal Features in Video Sequences

doi:10.11999/JEIT170592


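Abstract: In video-based facial expression recognition, static features cannot effectively describe how the facial region changes along the time axis. To address this limitation, this paper proposes an expression recognition method that fuses dynamic texture information with motion information. Drawing on the principle of LBP-TOP, it proposes a Spatio-Temporal Weber Local Descriptor (STWLD), which has descriptive power in both the spatial and temporal domains, to extract dynamic texture information. A Block-based Histogram of Optical Flow (BHOF) is used to describe motion information. Finally, an SVM classifies the expressions from the fused texture and motion information. Cross experiments on the CK+ and MMI expression databases show that the proposed method achieves better results than recognition methods based on a single feature; comparison experiments with other related methods also support the advantage of the method.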
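The LBP-TOP-style idea behind STWLD can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the STWLD formulation, so the code below only assumes that the differential excitation of the Weber Local Descriptor, arctan(sum(neighbor - center) / center), is histogrammed separately over the XY, XT and YT planes of a video volume and the three histograms are concatenated. The function names, the bin count, the epsilon term and the omission of the WLD orientation component are all assumptions of this sketch.

```python
import math

def add_excitation_counts(plane, hist):
    """Add WLD differential-excitation counts for the interior pixels of one 2D plane."""
    bins = len(hist)
    for r in range(1, len(plane) - 1):
        for c in range(1, len(plane[0]) - 1):
            center = plane[r][c]
            # Sum of intensity differences between the 8 neighbors and the center.
            diff = sum(plane[r + dr][c + dc] - center
                       for dr in (-1, 0, 1) for dc in (-1, 0, 1) if dr or dc)
            # Small epsilon avoids division by zero (assumption of this sketch).
            xi = math.atan(diff / (center + 1e-6))  # lies in (-pi/2, pi/2)
            idx = min(int((xi + math.pi / 2) / math.pi * bins), bins - 1)
            hist[idx] += 1.0

def normalize(hist):
    """Scale a histogram to sum to 1; an empty histogram is returned unchanged."""
    total = sum(hist)
    return [h / total for h in hist] if total else list(hist)

def three_plane_descriptor(volume, bins=8):
    """Concatenate XY, XT and YT histograms for a volume indexed as volume[t][y][x]."""
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    xy, xt, yt = [0.0] * bins, [0.0] * bins, [0.0] * bins
    for t in range(T):  # spatial planes: appearance
        add_excitation_counts(volume[t], xy)
    for y in range(H):  # horizontal-temporal planes
        add_excitation_counts([[volume[t][y][x] for x in range(W)] for t in range(T)], xt)
    for x in range(W):  # vertical-temporal planes
        add_excitation_counts([[volume[t][y][x] for y in range(H)] for t in range(T)], yt)
    return normalize(xy) + normalize(xt) + normalize(yt)
```

Calling `three_plane_descriptor(volume)` on a grayscale face volume returns a vector of length `3 * bins`. Normalizing each plane separately keeps the temporal planes from being dominated by the spatial one when the sequence is short.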


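Keywords: video sequences, expression recognition, Spatio-Temporal Weber Local Descriptor, Block-based Histogram of Optical Flow features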

Abstract: For facial expression recognition based on video sequences, static descriptors cannot effectively describe how facial regions change along the time axis. This paper proposes an expression recognition method based on dynamic texture and motion information. Drawing on the principle of Local Binary Pattern on Three Orthogonal Planes (LBP-TOP), a Spatio-Temporal Weber Local Descriptor (STWLD) is proposed to describe the dynamic texture of the facial expression sequence. Motion information is described using Block-based Histogram of Optical Flow (BHOF) features. The dynamic texture and motion information are then combined, and an SVM performs the expression classification. Cross experiments on the CK+ and MMI expression databases show that the method outperforms methods that use a single descriptor. Comparison experiments with other related methods also support the superiority of the method.
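A block-based histogram of optical flow can be sketched as follows. This is a hedged illustration, not the paper's BHOF: the abstract does not state the block layout, the number of orientation bins, or the weighting, so the 2x2 grid, 8 bins, magnitude weighting and per-block normalization below are assumptions. The flow components `u` and `v` are taken as given; how they are estimated is outside this sketch.

```python
import math

def block_flow_histogram(u, v, blocks=(2, 2), bins=8):
    """Concatenate one magnitude-weighted flow-orientation histogram per block.

    u and v are 2D lists holding the horizontal and vertical flow at each pixel.
    """
    rows, cols = len(u), len(u[0])
    block_rows, block_cols = blocks
    feature = []
    for i in range(block_rows):
        for j in range(block_cols):
            hist = [0.0] * bins
            # Integer block boundaries that together cover the whole grid.
            r0, r1 = i * rows // block_rows, (i + 1) * rows // block_rows
            c0, c1 = j * cols // block_cols, (j + 1) * cols // block_cols
            for r in range(r0, r1):
                for c in range(c0, c1):
                    dx, dy = u[r][c], v[r][c]
                    mag = math.hypot(dx, dy)
                    if mag == 0.0:
                        continue  # a static pixel has no defined direction
                    angle = math.atan2(dy, dx) % (2 * math.pi)  # map to [0, 2*pi]
                    idx = min(int(angle / (2 * math.pi) * bins), bins - 1)
                    hist[idx] += mag
            total = sum(hist)
            # Normalize per block; an all-static block yields zeros.
            feature.extend(h / total if total else 0.0 for h in hist)
    return feature
```

One plausible way to combine this with a texture descriptor is to concatenate the two vectors before passing them to an SVM, although the abstract does not specify the fusion scheme.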

Key words: Video sequences; Expression recognition; Spatio-Temporal Weber Local Descriptor (STWLD); Block-based Histogram of Optical Flow (BHOF)

Received: 2017-06-20 Published: 2017-12-27

CLC number: TP391.43

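Funding: National Natural Science Foundation of China (61672202, 61432004, 61300119); Key Program of the NSFC-Shenzhen Joint Fund (U1613217); Open Project of the Jiangsu Engineering Laboratory for Internet of Things and Mobile Internet Technology (JSWLW-2017-017)

Corresponding author: WANG Xiaohua: female, born in 1976, associate professor. Research interests: digital image processing, affective computing, computer vision. E-mail: xh_wang@hfut.edu.cn

About the authors: WANG Xiaohua: female, born in 1976, associate professor. Research interests: digital image processing, affective computing, computer vision. XIA Chen: male, born in 1992, master's student. Research interests: computer vision, pattern recognition. HU Min: female, born in 1967, professor. Research interests: digital image processing, computer vision, pattern recognition. REN Fuji: male, born in 1959, professor. Research interests: signal and information processing, affective computing, computer vision, pattern recognition.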

Cite this article:

WANG Xiaohua, XIA Chen, HU Min, REN Fuji. Facial Expression Recognition Based on the Fusion of Spatio-temporal Features in Video Sequences[J]. Journal of Electronics & Information Technology (JEIT), 2018, 40(3): 626-632.

Link to this article:

http://jeit.ie.ac.cn/CN/10.11999/JEIT170592 or http://jeit.ie.ac.cn/CN/Y2018/V40/I3/626
