To address the loss of image spatial structure in the Bag-of-Visual-Words (BOVW) model, an image retrieval method based on Hessian sparse coding is introduced. First, an n-words model is built to obtain the local feature representation; it describes an image at a higher level as a series of visual word sequences. Experiments are run for n = 1 to n = 5 to find a suitable n. Second, the Hessian sparse coding formulation is obtained by incorporating the Hessian energy function into the standard sparse coding formulation. Finally, with the n-words sequences as the encoding features, the optimal Hessian sparse coding coefficients are computed with the feature-sign search algorithm, the similarity is computed, and the retrieval results are returned. Experiments on two datasets show that the proposed method outperforms the BOVW model and existing methods.
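The n-words step can be pictured as an n-gram count over visual-word ids. The sketch below is illustrative only: the abstract does not say how visual words are ordered or grouped into sequences, so the linear scan order, the function names, and the sample ids are all assumptions rather than the authors' procedure.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def n_word_sequences(words: Sequence[int], n: int) -> List[Tuple[int, ...]]:
    """Return every run of n consecutive visual-word ids (an n-gram)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def n_word_histogram(words: Sequence[int], n: int) -> Dict[Tuple[int, ...], int]:
    """Count how often each n-word sequence occurs in one image."""
    return dict(Counter(n_word_sequences(words, n)))


# Hypothetical visual-word ids for one image, in an assumed scan order.
image_words = [3, 7, 7, 2, 3, 7]

for n in range(1, 6):  # the abstract evaluates n = 1 to n = 5
    print(n, n_word_histogram(image_words, n))
```

With n = 1 this reduces to the ordinary BOVW histogram; larger n keeps some of the local ordering that BOVW discards, at the cost of a sparser and larger feature space.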