RUAN Xiaogang, LIN Jia, YU Naigong, ZHU Xiaoqing, OUATTARA Sie
(College of Electronic Information and Control Engineering, Beijing University of Technology, Beijing 100124, China)
(Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing 100124, China)
To segment a moving hand without relying on unreasonable assumptions and to handle hand-face occlusion, a segmentation method based on skin color, grayscale, depth, and motion cues is proposed. First, the Motion Region of Interest (MRoI) is adaptively extracted from the variance of the grayscale and depth optical flow to locate the moving body part. Then, corners within the MRoI that satisfy skin color and adaptive motion constraints are detected as skin seed points. Next, the skin seed points are grown into a candidate hand region according to skin color, depth, and motion criteria. Finally, edge depth gradient, skeleton extraction, and optimal path search are used to segment the moving hand region from the candidate hand region. Experimental results show that the proposed method segments the moving hand region effectively and accurately under different circumstances, especially when the hand occludes the face.
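The seed-growing step can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not state the growth rules or thresholds, so the function name `grow_region`, the 4-connectivity, the depth-step test, and the `max_depth_step` value are all assumptions made for illustration, and the motion criterion is left out entirely. The sketch only shows how a depth criterion can stop a skin region from spreading across a hand-face boundary.

```python
from collections import deque


def grow_region(depth, skin, seeds, max_depth_step=10.0):
    """Grow a 4-connected region from seed pixels (illustrative sketch).

    A neighbour joins the region when it is flagged as skin and its depth
    differs from the pixel it is reached from by at most max_depth_step.
    Both criteria and the threshold are assumptions, not the paper's rules.
    """
    height, width = len(depth), len(depth[0])
    region = [[False] * width for _ in range(height)]
    queue = deque()

    # Start from every seed that lies on a skin pixel.
    for row, col in seeds:
        if skin[row][col] and not region[row][col]:
            region[row][col] = True
            queue.append((row, col))

    # Breadth-first growth over the 4-neighbourhood.
    while queue:
        row, col = queue.popleft()
        for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            n_row, n_col = row + d_row, col + d_col
            if not (0 <= n_row < height and 0 <= n_col < width):
                continue
            if region[n_row][n_col] or not skin[n_row][n_col]:
                continue
            if abs(depth[n_row][n_col] - depth[row][col]) <= max_depth_step:
                region[n_row][n_col] = True
                queue.append((n_row, n_col))
    return region


# Toy scene: a near "hand" (depth 100) in front of a far "face" (depth 200).
# Every pixel except the top row is skin-colored, so color alone cannot
# separate the two surfaces.
toy_depth = [[100.0, 100.0, 100.0, 200.0, 200.0, 200.0] for _ in range(4)]
toy_skin = [[r > 0 for _ in range(6)] for r in range(4)]
toy_region = grow_region(toy_depth, toy_skin, seeds=[(1, 1)])
toy_area = sum(sum(row) for row in toy_region)
```

In this toy scene the region grown from the seed at `(1, 1)` stays on the near surface, because the depth jump between columns 2 and 3 exceeds the assumed threshold even though both sides are skin-colored.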