Acta Automatica Sinica, Issue (6): 1075-1085, 11. DOI: 10.3724/SP.J.1004.2014.01075
Cost-sensitive Ensemble Learning Algorithm for Multi-label Classification Problems
Abstract
Although a multi-label classification problem can be solved by converting it into a multi-class classification problem, a multi-label cost-sensitive classification problem is difficult to convert into a multi-class cost-sensitive classification problem. Based on an analysis of the problems encountered when multi-class cost-sensitive learning algorithms are extended to the multi-label setting, a cost-sensitive ensemble learning algorithm for multi-label classification problems is proposed. The average misclassification cost of the algorithm is composed of the fall-out cost and the omission cost. The procedure of the new algorithm is similar to that of the adaptive boosting (AdaBoost) algorithm: it automatically learns a set of weak classifiers and combines them into a strong classifier, and the average misclassification cost of the strong classifier decreases as the number of weak classifiers increases. The differences between the proposed algorithm and the cost-sensitive AdaBoost algorithm for multi-class classification problems are analyzed in detail, including the basis on which output labels are chosen and the meaning of the misclassification cost. Unlike in general multi-class cost-sensitive classification problems, the misclassification costs in multi-label cost-sensitive classification problems are subject to certain restrictions, and these restrictions are given explicitly. A multi-label AdaBoost algorithm and a multi-class cost-sensitive AdaBoost algorithm can be obtained by simplifying the proposed algorithm. Theoretical analysis and experimental results show that the proposed algorithm is effective and that it can minimize the average misclassification cost.
In particular, when the costs of the classes differ greatly, the proposed algorithm can achieve better results than existing multi-class cost-sensitive AdaBoost algorithms.
Key words
Multi-label classification/cost-sensitive learning/ensemble learning/adaptive boosting (AdaBoost) algorithm/multi-class classification
Cite this article
Fu Zhongliang (付忠良). Cost-sensitive ensemble learning algorithm for multi-label classification problems [J]. Acta Automatica Sinica, 2014, (6): 1075-1085, 11.
Funding
Supported by the Key Technology Research and Development Program of Sichuan Province of China (2011GZ0171, 2012GZ0106)