Video Coding Parameters Effect on Object Recognition
Wu Ze-min, Liu Tao, Jiang Qing-zhu, Hu Lei
(College of Communications Engineering, PLA University of Science and Technology, Nanjing 210007, China)
Abstract Object recognition and video coding and transmission have each been studied extensively. However, no published work reports how video encoding parameters influence object recognition. To address this, the Deformable Part Model (DPM), a typical object recognition algorithm, and H.264/AVC, the most commonly used video coding standard, are chosen as the test subjects. To study how bit rate and resolution affect video object recognition performance, coding and detection experiments are designed, and a function describing how recognition performance changes with bit rate and resolution is fitted. The results show that a compromise between channel bandwidth and video object recognition performance can be achieved by selecting appropriate bit rate and resolution parameters for the encoder, which provides a basis for the encoding optimization objective function in different video applications.
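The encoding side of such an experiment can be sketched as a grid of H.264 encodes over bit rate and resolution. The following is a minimal illustration only, not the authors' procedure: the file names, the bit rates, and the resolutions are hypothetical placeholders, and it assumes an `ffmpeg` build with the `libx264` encoder. The function only builds the command lines and does not run them.

```python
from itertools import product


def build_encode_commands(src, bitrates_kbps, resolutions):
    """Return one ffmpeg argument list per (bit rate, resolution) pair.

    All values passed in are illustrative assumptions, not settings
    taken from the paper.
    """
    commands = []
    for kbps, (width, height) in product(bitrates_kbps, resolutions):
        # Hypothetical output naming scheme: resolution and target bit rate.
        out = f"enc_{width}x{height}_{kbps}k.mp4"
        commands.append([
            "ffmpeg", "-y", "-i", src,
            "-c:v", "libx264",                 # H.264/AVC encoder
            "-b:v", f"{kbps}k",                # target video bit rate
            "-vf", f"scale={width}:{height}",  # output resolution
            "-an",                             # drop audio; only video matters here
            out,
        ])
    return commands


# Hypothetical grid: three bit rates at two resolutions.
grid = build_encode_commands(
    "input.mp4", [256, 512, 1024], [(352, 288), (704, 576)]
)
```

Each resulting file would then be decoded and passed to the detector, so that recognition performance can be recorded per (bit rate, resolution) pair and a function fitted to those measurements.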
Received: 18 December 2014
Published: 02 June 2015
Corresponding Authors:
Liu Tao
E-mail: ltaoliu_tao@foxmail.com