Abstract: T-distributed Stochastic Neighbor Embedding (t-SNE) is introduced into the cluster ensemble problem, and a cluster ensemble approach based on t-SNE is proposed. First, t-SNE is used to minimize the Kullback-Leibler divergence between the high-dimensional points corresponding to the rows of the hypergraph's adjacency matrix and their low-dimensional mapping points, thereby preserving the structure of the high-dimensional space in the low-dimensional space. Then, a hierarchical clustering algorithm is run in the low-dimensional space to obtain the final clustering result. Experimental results on several benchmark datasets indicate that t-SNE improves the results of the hierarchical clustering algorithm and that the proposed t-SNE-based cluster ensemble method outperforms state-of-the-art methods.
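The pipeline described above can be sketched as follows. This is an assumed implementation, not the authors' code: the base partitions, their number, the one-hot hypergraph encoding, and all parameter values (number of clusters, perplexity) are illustrative choices, using scikit-learn's `TSNE` and `AgglomerativeClustering` on a standard dataset.

```python
# Sketch of the described ensemble pipeline (assumptions noted above):
# 1) run several base clusterings, 2) build the binary hypergraph adjacency
# matrix (one column per cluster across all base partitions), 3) embed its
# rows with t-SNE, 4) apply hierarchical clustering in the embedded space.
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.datasets import load_iris
from sklearn.manifold import TSNE

X, _ = load_iris(return_X_y=True)

# Steps 1-2: hypergraph adjacency matrix H (n_samples x total_clusters);
# H[i, j] = 1 iff sample i belongs to cluster j of some base partition.
blocks = []
for seed in range(10):                       # 10 base partitions (assumed)
    labels = KMeans(n_clusters=3, n_init=5, random_state=seed).fit_predict(X)
    blocks.append(np.eye(3)[labels])         # one-hot encode this partition
H = np.hstack(blocks)

# Step 3: t-SNE minimizes the KL divergence between the neighbor
# distributions of the rows of H and their low-dimensional images.
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(H)

# Step 4: hierarchical (agglomerative) clustering yields the final labels.
final = AgglomerativeClustering(n_clusters=3).fit_predict(emb)
```

In this sketch the rows of `H` play the role of the high-dimensional points mentioned in the abstract; any base clusterer and any linkage criterion could be substituted.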