HUANG Chengquan①② WANG Shitong① JIANG Yizhang① DONG Aimei①③
①(School of Digital Media, Jiangnan University, Wuxi 214122, China) ②(Engineering Training Center, Guizhou Minzu University, Guiyang 550025, China) ③(School of Information, Qilu University of Technology, Jinan 250353, China)
Coordinate Descent (CD) is a promising method for large-scale pattern classification problems, offering straightforward operation and fast convergence. In this paper, inspired by the v-soft margin Support Vector Machine (v-SVM) for pattern classification, a new v-Soft Margin Logistic Regression Classifier (v-SMLRC) is proposed to enhance the generalization performance of the Logistic Regression Classifier (LRC). The dual of v-SMLRC can be regarded as an equality-constrained CDdual problem, and on this basis a new large-scale pattern classification method, called v-SMLRC-CDdual, is proposed. The proposed v-SMLRC-CDdual maximizes the inner-class margin and effectively enhances the generalization performance of LRC. Empirical results on large-scale document datasets demonstrate that the proposed method is effective and comparable to other related methods.
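To illustrate the general CDdual idea the abstract builds on (not the paper's v-SMLRC-CDdual itself), the following is a minimal sketch of dual coordinate descent for plain L2-regularized logistic regression in the style of Yu, Huang, and Lin (2011, reference [14]): each dual variable is updated in turn by a one-dimensional Newton solve while the primal vector w is maintained incrementally. All names and parameter choices here are illustrative assumptions.

```python
import numpy as np

def dual_cd_logreg(X, y, C=1.0, outer_iters=50, newton_iters=20, eps=1e-12):
    """Illustrative dual coordinate descent for L2-regularized logistic
    regression (a sketch after Yu, Huang & Lin, 2011).  Labels y in {-1, +1}.

    Dual problem:
        min_a  1/2 a^T Q a + sum_i [ a_i log a_i + (C - a_i) log(C - a_i) ]
        s.t.   0 < a_i < C,   with Q_ij = y_i y_j x_i^T x_j
    and the primal vector is recovered as w = sum_i a_i y_i x_i.
    """
    n, _ = X.shape
    a = np.full(n, C / 2.0)                # start strictly inside (0, C)
    w = X.T @ (a * y)                      # maintain w = sum_i a_i y_i x_i
    Qii = np.einsum('ij,ij->i', X, X)      # diagonal of Q (y_i^2 = 1)
    rng = np.random.default_rng(0)         # fixed seed for reproducibility
    for _ in range(outer_iters):
        for i in rng.permutation(n):
            ai, qi = a[i], Qii[i]
            # Constant part of the 1-D gradient: sum_{j != i} Q_ij a_j
            b = y[i] * (w @ X[i]) - qi * ai
            z = ai
            for _ in range(newton_iters):  # Newton on the 1-D subproblem
                g = qi * z + b + np.log(z / (C - z))      # gradient
                h = qi + C / (z * (C - z))                # curvature
                z = np.clip(z - g / h, eps, C - eps)      # stay inside (0, C)
            w += (z - ai) * y[i] * X[i]    # incremental update of w
            a[i] = z
    return w
```

The key design point mirrored from the CDdual literature is that each coordinate update costs O(d) thanks to the maintained w, which is what makes the approach attractive for large sparse document collections.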
[1]
BOTTOU L and BOUSQUET O. The tradeoffs of large scale learning[C]. Proceedings of Advances in Neural Information Processing Systems, Cambridge, 2008: 151-154.
[2]
LIN C Y, TSAI C H, LEE C P, et al. Large-scale logistic regression and linear support vector machines using Spark[C]. Proceedings of 2014 IEEE International Conference on Big Data, Washington DC, 2014: 519-528. doi: 10.1109/BigData.2014.7004269.
[3]
AGERRI R, ARTOLA X, BELOKI Z, et al. Big data for natural language processing: A streaming approach[J]. Knowledge-Based Systems, 2015, 79: 36-42. doi: 10.1016/j.knosys.2014.11.007.
[4]
DARROCH J N and RATCLIFF D. Generalized iterative scaling for log-linear models[J]. The Annals of Mathematical Statistics, 1972, 43(5): 1470-1480. doi: 10.1214/aoms/1177692379.
[5]
DELLA PIETRA S, DELLA PIETRA V, and LAFFERTY J. Inducing features of random fields[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997, 19(4): 380-393. doi: 10.1109/34.588021.
[6]
GOODMAN J. Sequential conditional generalized iterative scaling[C]. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, 2002: 9-16. doi: 10.3115/1073083.1073086.
[7]
JIN R, YAN R, ZHANG J, et al. A faster iterative scaling algorithm for conditional exponential model[C]. Proceedings of the 20th International Conference on Machine Learning, New York, 2003: 282-289.
[8]
HUANG F L, HSIEH C J, CHANG K W, et al. Iterative scaling and coordinate descent methods for maximum entropy models[J]. Journal of Machine Learning Research, 2010, 11(2): 815-848.
[9]
MINKA T P. A comparison of numerical optimizers for logistic regression[OL]. http://research.microsoft.com/en-us/um/people/minka/papers/logreg/minka-logreg.pdf. 2007.
[10]
KOMAREK P and MOORE A W. Making logistic regression a core data mining tool: a practical investigation of accuracy, speed, and simplicity[R]. Technical report TR-05-27, Robotics Institute of Carnegie Mellon University, Pittsburgh, 2005.
[11]
LIN C J, WENG R C, and KEERTHI S S. Trust region Newton method for large-scale logistic regression[J]. Journal of Machine Learning Research, 2008, 9(4): 627-650.
[12]
KEERTHI S S, DUAN K B, SHEVADE S K, et al. A fast dual algorithm for kernel logistic regression[J]. Machine Learning, 2005, 61(1-3): 151-165. doi: 10.1007/s10994-005-0768-5.
[13]
PLATT J C. Fast training of support vector machines using sequential minimal optimization[C]. Proceedings of Advances in Kernel Methods: Support Vector Learning, Cambridge, 1999: 185-208.
[14]
YU H F, HUANG F L, and LIN C J. Dual coordinate descent methods for logistic regression and maximum entropy models[J]. Machine Learning, 2011, 85(1/2): 41-75. doi: 10.1007/s10994-010-5221-8.
[15]
GU X, WANG S T, and XU M. A new cross-multidomain classification algorithm and its fast version for large datasets[J]. Acta Automatica Sinica, 2014, 40(3): 531-547. doi: 10.3724/SP.J.1004.2014.00531.
[16]
GU X and WANG S T. Fast cross-domain classification method for large multisources/small target domains[J]. Journal of Computer Research and Development, 2014, 51(3): 519-535. doi: 10.7544/issn1000-1239.2014.20120652.
[17]
ZHANG X F, CHEN B, WANG P H, et al. A target recognition method based on dirichlet process latent variable support vector machine model[J]. Journal of Electronics & Information Technology, 2015, 37(1): 29-36. doi: 10.11999/JEIT140129.
[18]
JI X R, HOU C Q, and HOU Y B. Research on the distributed training method for linear SVM in WSN[J]. Journal of Electronics & Information Technology, 2015, 37(3): 708-714. doi: 10.11999/JEIT140408.
[19]
GAO F R, WANG J J, XI X G, et al. Gait recognition for lower extremity electromyographic signals based on PSO-SVM method[J]. Journal of Electronics & Information Technology, 2015, 37(5): 1154-1159. doi: 10.11999/JEIT141083.
[20]
HSIEH C J, CHANG K W, LIN C J, et al. A dual coordinate descent method for large-scale linear SVM[C]. Proceedings of the 25th International Conference on Machine Learning, New York, 2008: 408-415. doi: 10.1145/1390156.1390208.
[21]
CHEN P H, LIN C J, and SCHÖLKOPF B. A tutorial on v-support vector machines[J]. Applied Stochastic Models in Business and Industry, 2005, 21(2): 111-136. doi: 10.1002/asmb.537.
[22]
PENG X J, CHEN D J, and KONG L Y. A clipping dual coordinate descent algorithm for solving support vector machines[J]. Knowledge-Based Systems, 2014, 71: 266-278. doi: 10.1016/j.knosys.2014.08.005.
[23]
TSAI C H, LIN C Y, and LIN C J. Incremental and decremental training for linear classification[C]. Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, 2014: 343-352. doi: 10.1145/2623330.2623661.