Hesse Sparse Representation under <em>n</em>-words Model for Image Retrieval

doi:10.11999/JEIT150617


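Abstract

To address the drawback that the bag-of-visual-words (BOVW) model discards the spatial structure of images, this paper proposes an image retrieval algorithm based on Hesse sparse coding. First, an n-words model is built to obtain a local feature representation of the image. The n-words model is formed from runs of consecutive visual words and serves as a higher-level description of image features. Experiments are run for n=1 to n=5 to find the most suitable value of n. Second, the second-order Hesse energy function is incorporated into the objective function of standard sparse coding, yielding the Hesse sparse coding formulation. Finally, with the resulting n-words sequences as the features to be encoded, the feature-sign search algorithm is used to solve for the optimal Hesse coefficients, similarities are computed, and the retrieval results are returned. Experiments on two types of datasets show that, compared with the BOVW model and existing algorithms, the new algorithm greatly improves retrieval accuracy.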
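
The n-words step can be illustrated with a minimal sketch. It assumes each image has already been quantized into an ordered one-dimensional sequence of visual-word IDs; the abstract does not say how "consecutive" is defined spatially, so that ordering is an assumption, and the function names `n_words` and `n_words_histogram` are hypothetical, not taken from the paper.

```python
from collections import Counter
from typing import Hashable, List, Sequence, Tuple


def n_words(visual_words: Sequence[Hashable], n: int) -> List[Tuple[Hashable, ...]]:
    """Return every run of n consecutive visual words as a tuple.

    Assumes visual_words is already ordered; how that order is derived
    from the image is not specified in the abstract.
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    # A sequence of length L yields L - n + 1 runs (none if L < n).
    return [tuple(visual_words[i:i + n]) for i in range(len(visual_words) - n + 1)]


def n_words_histogram(visual_words: Sequence[Hashable], n: int) -> Counter:
    """Count how often each n-words run occurs in one image."""
    return Counter(n_words(visual_words, n))


if __name__ == "__main__":
    words = [3, 7, 7, 2, 3, 7]  # toy visual-word IDs for one image
    for n in range(1, 6):       # the abstract tests n = 1 .. 5
        print(n, n_words_histogram(words, n))
```

With n=1 this reduces to the ordinary BOVW histogram; larger n keeps some ordering information at the cost of a much larger and sparser vocabulary, which is presumably why the suitable n is chosen experimentally.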


Keywords: image retrieval, sparse coding, bag-of-visual-words (BOVW) model, n-words model, Hesse energy function

Abstract：

To address the problem that the Bag-Of-Visual-Words (BOVW) model discards image spatial structure, a new image retrieval method based on Hessian sparse coding is introduced. First, the n-words model is built to obtain the local feature representation. The n-words model establishes a high-level description of an image from sequences of consecutive visual words. Experiments are performed for n=1 to n=5 to find the most suitable n. Second, the Hessian sparse coding formulation is obtained by incorporating the Hessian energy function into the standard sparse coding formulation. Finally, using the obtained n-words sequences as the encoding features, the optimal Hessian coefficients are calculated with the feature-sign search algorithm, the similarity is computed, and the retrieval results are returned. Experiments on two datasets show that the proposed method outperforms the BOVW model and existing methods.
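
For orientation, a Hessian-regularized sparse coding objective generally takes the following form. This is a sketch of the general shape only, not necessarily the exact formula used in the paper; all symbols are assumptions introduced here: X holds the features to encode (here, n-words descriptors) as columns, B is the dictionary with atoms b_j, S holds the sparse coefficient vectors s_i as columns, H is a Hessian energy matrix estimated from local neighborhoods of the data, and β, λ, c are hypothetical trade-off and norm-bound constants.

```latex
\min_{B,\,S}\;
  \lVert X - BS \rVert_F^2
  \;+\; \beta\,\operatorname{tr}\!\left(S H S^{\top}\right)
  \;+\; \lambda \sum_{i} \lVert s_i \rVert_1
\qquad \text{s.t. } \lVert b_j \rVert_2^2 \le c \;\; \forall j
```

Setting β = 0 recovers standard sparse coding. With B fixed, the problem for each s_i is an L1-regularized quadratic, which is the kind of subproblem the feature-sign search algorithm is designed to solve.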

Key words: Image retrieval, Sparse coding, Bag-Of-Visual-Words (BOVW) model, n-words model, Hesse energy function

Received: 2015-05-20 Published: 2016-03-11

PACS:

TP391

Funding:

The National Natural Science Foundation of China (61201323)

Corresponding author: WANG Ruixia: female, born in 1984, Ph.D. candidate. Her research interest is content-based image retrieval. E-mail: wangruixia921@163.com

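About the authors: WANG Ruixia: female, born in 1984, Ph.D. candidate. Her research interest is content-based image retrieval. PENG Guohua: male, born in 1962, doctoral supervisor. His research interests include computer graphics, computer-aided geometric processing, image processing, and computer vision.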

Cite this article:

WANG Ruixia, PENG Guohua. Hesse Sparse Representation under n-words Model for Image Retrieval[J]. Journal of Electronics & Information Technology (JEIT), 2016, 38(5): 1115-1122.

Link to this article:

http://jeit.ie.ac.cn/CN/10.11999/JEIT150617 or http://jeit.ie.ac.cn/CN/Y2016/V38/I5/1115
