Clustering Algorithms for Large-scale Social Networks Based on Structural Similarity
Chen Ji-meng① Chen Jia-jun② Liu Jie① Huang Ya-lou② Wang Yuan① Feng Xia③
①(College of Computer and Control Engineering, Nankai University, Tianjin 300071, China) ②(College of Software, Nankai University, Tianjin 300071, China) ③(Information Technology Research Base of CAAC, Civil Aviation University of China, Tianjin 300300, China)
Abstract: To cluster large-scale directed social networks, a Structural Clustering Algorithm for Directed Networks (DirSCAN) and a corresponding Parallel algorithm (PDirSCAN) are proposed. DirSCAN accounts for the directed behavioral relations between pairs of vertices and is built on action structural similarity and function analysis. To meet the demands of large-scale social network analysis, PDirSCAN is designed as a lossless parallelization of DirSCAN on the MapReduce distributed architecture to improve processing performance. Extensive experiments on real-world network datasets show that DirSCAN improves on SCAN by up to 2.34% in F1, and that PDirSCAN runs 1.67 times faster than DirSCAN.
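As background for the structural similarity the abstract refers to, the following is a minimal sketch of the similarity measure used by the original (undirected) SCAN algorithm, where the closed neighborhood Γ(v) contains v and its neighbors. The abstract does not define DirSCAN's action structural similarity, so this is not the authors' measure; the function names and the toy graphs are hypothetical, and feeding out-neighbor sets to the same function is only one conceivable directed adaptation, stated here as an assumption.

```python
from math import sqrt


def closed_neighborhood(adj, v):
    """Return the closed neighborhood of v: its listed neighbors plus v itself."""
    return set(adj.get(v, ())) | {v}


def structural_similarity(adj, v, w):
    """SCAN-style similarity: |N(v) & N(w)| / sqrt(|N(v)| * |N(w)|).

    `adj` maps each vertex to a set of neighbors. For an undirected graph
    the sets are symmetric. Passing out-neighbor sets of a directed graph
    is an assumption made for illustration, not DirSCAN's definition.
    """
    nv = closed_neighborhood(adj, v)
    nw = closed_neighborhood(adj, w)
    return len(nv & nw) / sqrt(len(nv) * len(nw))


# Hypothetical undirected toy graph: a triangle a-b-c with a pendant vertex d.
undirected = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

# Hypothetical directed toy graph given as out-neighbor sets: a -> b only.
directed = {"a": {"b"}, "b": set()}

sim_ab = structural_similarity(undirected, "a", "b")
sim_cd = structural_similarity(undirected, "c", "d")
sim_dir = structural_similarity(directed, "a", "b")
```

In SCAN, pairs whose similarity reaches a threshold are treated as structurally connected, and clusters grow from vertices with enough such neighbors. How DirSCAN modifies this step for directed edges is described in the paper body, not in the abstract.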