Word similarity measurement plays an important role in machine learning, information retrieval, and many other fields. Taking the concept primitive symbol system of Hierarchical Network of Concepts (HNC) theory as the semantic resource and comparing commonality with difference, this paper proposes a multi-dimensional method for computing similarity that considers the system's hierarchy, network structure, comparability and duality, attached features, and quintuple information. A weighting strategy is introduced for node depth and distance measurement to better discriminate between node levels. Experiments on a manually scored test set show that the computed similarities are consistent with human judgments. The proposed method achieves 0.812, 0.786, and 0.775 in compatibility degree, correlation coefficient, and ordinal pair conformity, respectively. A correlation test further confirms that the computed similarities and the human scores are significantly correlated.