Affective Abstract Image Classification Based on Convolutional Sparse Autoencoders across Different Domains
FAN Yangyu① LI Zuhe①② WANG Fengqin② MA Jiangtao②
①(School of Electronics and Information, Northwestern Polytechnical University, Xi’an 710072, China) ②(School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450002, China)
To apply unsupervised feature learning to emotional semantic analysis of images in small-sample settings, convolutional sparse autoencoder based self-taught learning is adopted for domain adaptation in the affective classification of a small number of labeled abstract images. To visually compare the results of feature learning in different domains, an average-gradient criterion is further proposed for sorting the weights learned by sparse autoencoders. Image patches are first sampled at random from a large number of unlabeled images in the source domain, and local features are learned from them with a sparse autoencoder. The weight matrices corresponding to the different features are then sorted by the minimal average gradient of each matrix over the three color channels. Global feature activations of the labeled images in the target domain are finally computed by a convolutional neural network with a pooling layer and fed into a logistic regression model for affective classification. Experimental results show that self-taught learning based domain adaptation can supply training data for unsupervised feature learning in target domains with limited samples, and that sparse autoencoder based feature learning across domains yields better recognition performance than low-level visual features in emotional semantic analysis of a limited number of abstract images.
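To illustrate the weight-sorting step described above, the following minimal NumPy sketch (not taken from the paper; the function names, the row-wise weight layout, and the ascending sort order are assumptions) computes the average gradient of each learned weight matrix in the three color channels and orders the filters by the minimal value across channels.

```python
import numpy as np

def average_gradient(channel):
    """Average gradient of one 2-D weight matrix (single color channel):
    mean of sqrt((dx^2 + dy^2) / 2) over horizontal/vertical differences."""
    dx = np.diff(channel, axis=1)[:-1, :]   # horizontal differences, cropped to a common size
    dy = np.diff(channel, axis=0)[:, :-1]   # vertical differences, cropped to a common size
    return float(np.mean(np.sqrt((dx ** 2 + dy ** 2) / 2.0)))

def sort_weights_by_min_avg_gradient(W, patch_size):
    """Sort learned weight vectors (rows of W, each of length 3 * patch_size**2,
    one RGB patch per row) by the minimal average gradient over the three channels."""
    scores = []
    for w in W:
        patch = w.reshape(patch_size, patch_size, 3)              # back to an r x r x 3 patch
        scores.append(min(average_gradient(patch[:, :, c]) for c in range(3)))
    order = np.argsort(scores)                                    # ascending order is assumed here
    return W[order], np.asarray(scores)[order]

# Example: 100 hidden units learned from 8 x 8 RGB patches.
W = np.random.randn(100, 3 * 8 * 8)
W_sorted, ag_values = sort_weights_by_min_avg_gradient(W, patch_size=8)
```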
FAN Yangyu, LI Zuhe, WANG Fengqin, MA Jiangtao. Affective abstract image classification based on convolutional sparse autoencoders across different domains[J]. Journal of Electronics & Information Technology, 2017, 39(1): 167-175.