Vector Influence Clustering Coefficient Based Efficient Directed Community Detection Algorithm
DENG Xiaolong① ZHAI Jiayu② YIN Luanyu③
①(Key Laboratory of Trustworthy Distributed Computing and Service of Education Ministry, Beijing University of Posts and Telecommunications, Beijing 100876,China) ②(International School, Beijing University of Posts and Telecommunications, Beijing 100876, China) ③(Academy of Social Management, Beijing Normal University, Beijing 100875, China)
摘要 社团结构划分对于分析复杂网络的统计特性非常重要,以往研究往往侧重对无向网络的社团结构挖掘,对新兴的微信朋友圈网络、微博关注网络等涉及较少,并且缺乏高效的划分工具。为解决传统社团划分算法在大规模有向社交网络上无精确划分模拟模型,算法运行效率低,精度偏差大的问题。该文从构成社团结构最基础的三角形极大团展开数学推导,对网络节点的局部信息传递过程进行建模,并引入概率图有向矢量计算理论,对有向社交网络中具有较大信息传递增益的节点从数学基础创造性地构建了有向传递增益系数(Information Transfer Gain, ITG)。该文以此构建了新的有向社团结构划分效果的目标函数,提出了新型有向网络社团划分算法ITG,通过在模拟网络数据集和真实网络数据集上进行实验,验证了所提算法的精确性和新颖性,并优于FastGN, OSLOM和Infomap等经典算法。
Abstract:Community detection method is significant to character statistics of complex network. Community detection in directed structured network is an attractive research problem while most previous approaches attempt to divide undirected networks into communities while there has appeared many large scale directed social network such as WeChat circle of friends and Sina Micro-Blog. To solve the problem that low quality of model, low efficiency of execution and high deviation of precision from the conventional community detection algorithm on large-scale social network and directed network, this paper provides an approach that starts with the triangle structure of community basis and models the local information transfer to detect community in large-scale directed social network. Basing on the directed vector theory in probability graph and the high information transfer gain of vertex in directed network, this paper constructs the Information Transfer Gain (ITG) method and the corresponding target functions for evaluating the quality of a specific partition in community detection algorithm. Then the combine of ITG with the target function to compose the new community detection algorithm for directed network. Extensive experiments in synthetic signed network and real-life large networks derived from online social media, it is proved that the proposed method is more accurate and faster than several traditional community detection methods such as FastGN, OSLOM and Infomap.
XIE J, KELLEY S, and SZYMANSKI B K. Overlapping community detection in networks: The state-of-the-art and comparative study[J]. ACM Computing Surveys (CSUR), 2013, 45(4): 2-35. doi: 10.1145/2501654.2501657.
[2]
NEWMAN M E J. Fast algorithm for detecting community structure in networks[J]. Physical Review E, 2004, 69(6): 066133. doi: 10.1103/PhysRevE.69.066113.
[3]
CLAUSET A, NEWMAN M E J, and MOORE C. Finding community structure in very large networks[J]. Physical Review E, 2004, 70(6): 066111. doi: 10.1103/PhysRevE.70. 066111.
[4]
BLONDEL V D, GUILLAUME J L, LAMBIOTTE R, et al. Fast unfolding of communities in large networks[J]. Journal of Statistical Mechanics: Theory and Experiment, 2008, 2008(10): 1-12. doi: 10.1088/1742-5468/2008/10/P10008.
[5]
PRAT-PÉREZ A, DOMINGUEZ-SAL D, and LARRIBA- PEY J L. High quality, scalable and parallel community detection for large real graphs[C]. The 23rd International Conference on World Wide Web, ACM, Seoul, Korea 2014: 225-236. doi: 10.1145/2566486.2568010.
[6]
ZHU X, GHAHRAMANI Z, and LAFFERTY J. Semi- supervised learning using Gaussian fields and harmonic functions[C]. International Conference on Machine Learning, Washington D.C., US, 2003, 3: 912-919.
[7]
RAGHAVAN U N, ALBERT R, and KUMARA S. Near linear time algorithm to detect community structures in large-scale networks[J]. Physical Review E, 2007, 76(3): 036106. doi: 10.1103/PhysRevE.76.036106.
[8]
PONS P and LATAPY M. Computing communities in large networks using random walks[C]. International Symposium on Computer and Information Sciences. Springer Berlin Heidelberg, Krakow, Poland, 2005: 284-293. doi: 10.1007/ 11569596_31.
[9]
ROSVALL M and BERGSTROM C T. Maps of random walks on complex networks reveal community structure[J]. Proceedings of the National Academy of Sciences, 2008, 105(4): 1118-1123. doi: 10.1073/pnas.0706851105.
[10]
LANCICHINETTI A and FORTUNATO S. Community detection algorithms: A comparative analysis[J]. Physical Review E, 2009, 80(5): 056117. doi: 10.1103/PhysRevE.80. 056117.
[11]
PALLA G, DERÉNYI I, FARKAS I, et al. Uncovering the overlapping community structure of complex networks in nature and society[J]. Nature, 2005, 435(7043): 814-818. doi: 10.1038/nature03607.
[12]
AHN Y Y, BAGROW J P, and LEHMANN S. Link communities reveal multi scale complexity in networks[J]. Nature, 2010, 466(7307): 761-764. doi: 10.1038/nature09182.
[13]
LANCICHINETTI A, RADICCHI F, RAMASCO J J, et al. Finding statistically significant communities in networks[J]. PloS One, 2011, 6(4): e18961. doi: 10.1371/journal.pone. 0018961.
[14]
YANG J and LESKOVEC J. Overlapping community detection at scale: A nonnegative matrix factorization approach[C]. The Sixth ACM International Conference on Web Search and Data Mining. ACM, Rome, Italy, 2013: 587-596. doi: 10.1145/2433396.2433471.
[15]
NEWMAN M E J and CLAUSET A. Structure and inference in annotated networks[J]. Nature Communications, 2016,7: 11863. doi: 10.1038/ncomms11863.
[16]
KOLLER D and FRIEDMAN N. Probabilistic Graphical Models: Principles and Techniques[M]. Massachusetts USA, MIT Press, 2009: 1-5.
[17]
PRAT-PÉREZ A, DOMINGUEZ-SAL D, BRUNAT J M, et al. Shaping communities out of triangles[C]. The 21st ACM International Conference on Information and Knowledge Management, ACM, 2012: 1677-1681. doi: 10.1145/2396761. 2398496.
[18]
LEVORATO V and PETERMANN C. Detection of communities in directed networks based on strongly p-connected components[C]. IEEE 2011 International Conference on Computational Aspects of Social Networks (CASoN), Salamanca, Spain, 2011: 211-216. doi: 10.1109/ CASON.2011.6085946.
[19]
ARENAS A, DUCH J, FERNÁNDEZ A, et al. Size reduction of complex networks preserving modularity[J]. New Journal of Physics, 2007, 9(6): 1-14. doi: 10.1088/1367-2630/9/6/176.