New Fuzzy-Clustering Algorithm for Data Stream
Sun Li-juan①② Chen Xiao-dong① Han Chong① Guo Jian①②
①(College of Computer, Nanjing University of Posts and Telecommunications, Nanjing 210003, China)
②(Jiangsu High Technology Research Key Laboratory for Wireless Sensor Networks, Nanjing University of Posts and Telecommunications, Nanjing 210003, China)
Abstract Clustering data streams is challenging because of limits on time and space. To address this problem, this paper presents a new fuzzy-clustering algorithm called Weight Decay Streaming Micro Clustering (WDSMC). The algorithm uses a modified weighted Fuzzy C-Means (FCM) algorithm and improves clustering quality through micro-cluster structures and weight decay. Experimental results show that WDSMC achieves better accuracy than the Stream Weight Fuzzy C-Means (SWFCM) and StreamKM++ algorithms.
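The abstract does not give the details of WDSMC, so the following is only a minimal sketch of the two generic ingredients it names: a weighted FCM update and a decaying weight. It is not the authors' algorithm. The exponential decay form `w * 2^(-decay_rate * elapsed)`, the function names `decay_weight` and `weighted_fcm`, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def decay_weight(weight, elapsed, decay_rate=0.01):
    """Fade a weight over time (assumed form: w * 2^(-decay_rate * elapsed))."""
    return weight * 2.0 ** (-decay_rate * elapsed)


def weighted_fcm(points, weights, n_clusters, m=2.0, n_iter=50, init=None, seed=0):
    """Weighted fuzzy c-means on a list of equal-length numeric tuples.

    Returns (centers, memberships), where memberships[j][i] is the degree to
    which point j belongs to cluster i, computed in the last iteration.
    """
    if init is None:
        # Hypothetical default: pick distinct input points as starting centers.
        init = random.Random(seed).sample(points, n_clusters)
    centers = [list(c) for c in init]
    exponent = 2.0 / (m - 1.0)
    dim = len(points[0])
    memberships = []
    for _ in range(n_iter):
        # Membership step: u_ij is proportional to d_ij^(-2 / (m - 1)).
        memberships = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            if any(d == 0.0 for d in dists):
                # The point coincides with a center: split membership among those.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(hits)
                memberships.append([h / total for h in hits])
            else:
                inv = [d ** -exponent for d in dists]
                total = sum(inv)
                memberships.append([v / total for v in inv])
        # Center step: each center is a mean weighted by w_j * u_ij^m.
        for i in range(n_clusters):
            coeffs = [w * u[i] ** m for w, u in zip(weights, memberships)]
            denom = sum(coeffs)
            if denom > 0.0:
                centers[i] = [
                    sum(c * p[k] for c, p in zip(coeffs, points)) / denom
                    for k in range(dim)
                ]
    return centers, memberships


# Usage: older points (larger age) contribute less to the centers.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
ages = [5, 4, 3, 2, 1, 0]
weights = [decay_weight(1.0, age) for age in ages]
centers, memberships = weighted_fcm(points, weights, 2, init=[(0.0, 0.0), (10.0, 10.0)])
print(centers)
```

In a streaming setting the weighted points would plausibly be micro-cluster summaries instead of raw samples, so that only a bounded number of items is kept in memory; that arrangement is likewise an assumption here, not something the abstract specifies.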
Received: 05 November 2014
Published: 02 June 2015
Corresponding Author: Sun Li-juan
E-mail: sunlj@njupt.edu.cn