Recognizing Users’ Focuses on Social Network Based on Mixed-weight Combined Strategy
JI Jianrui, LIU Yezheng, JIANG Yuanchun
(School of Management, Hefei University of Technology, Hefei 230009, China)
(Key Laboratory of Process Optimization and Intelligent Decision-making, Ministry of Education, Hefei 230009, China)
Abstract Topic models are an important tool for recognizing users’ focuses on social networks such as blogs, online communities, and microblogs. Because topic recognition for short texts on social network platforms poses particular difficulties, this paper develops an AW-LDA model based on a mixed-weight combined strategy that exploits the contextual relevance of short texts. The model virtually combines short texts that satisfy contextual-relatedness conditions and assigns each short text a weight according to its degree of relatedness, yielding a new method for recognizing the topics of short texts. Experiments on data from BBS and Weibo communities show that the model effectively recognizes social network users’ focuses on different subjects, and that it offers a new approach to the topic recognition problem for short texts.
Received: 09 December 2016
Published: 14 June 2017
Fund: The National Natural Science Foundation of China (71490725, 71521001, 71371062, 91546114, 71501057), the National 973 Program of China (2013CB329603), the National Key Technology Support Program (2015BAH26F00), and the MOE Project of Humanities and Social Sciences (15YJC630111)
Corresponding Author: JIANG Yuanchun
E-mail: ycjiang@hfut.edu.cn