AdaBoost-based class imbalance learning algorithm

Qin Mengmei, Qiu Jianlin, Lu Pengcheng, Chen Lulu, Zhao Weikang

Application Research of Computers, 2017, Vol. 34, Issue 11: 3229-3232, 3254, 5. DOI: 10.3969/j.issn.1001-3695.2017.11.006

AdaBoost-based class imbalance learning algorithm

Qin Mengmei¹, Qiu Jianlin², Lu Pengcheng³, Chen Lulu¹, Zhao Weikang¹

Author information

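  • 1. School of Electronics and Information, Nantong University, Nantong, Jiangsu 226019
  • 2. School of Computer Science and Technology, Nantong University, Nantong, Jiangsu 226019
  • 3. School of Computer and Information Engineering, Nantong Institute of Technology, Nantong, Jiangsu 226002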

Abstract

When dealing with imbalanced data sets, the borderline examples of the minority class are more easily misclassified. To reduce the impact of class imbalance on classifier performance, this paper presents an adaptive borderline-SMOTE (AB-SMOTE) algorithm. AB-SMOTE adaptively samples the boundary samples of the minority class, which improves the degree of balance and the efficiency of the data sets. AB-SMOTE is also combined with a data cleaning technique to form ABTAdaBoost, a new ensemble algorithm based on AdaBoost. ABTAdaBoost consists of three stages. In the first stage, AB-SMOTE is applied to the training data sets to reduce their degree of imbalance. In the second stage, Tomek links data cleaning removes the noise and overlapping instances introduced into the data sets by the sampling method, which also improves the usability of the data. In the third stage, AdaBoost generates an ensemble classifier from N weak classifiers. The experiments use the J48 decision tree and naive Bayes as base classifiers, respectively. The results show that ABTAdaBoost has the best overall performance among the compared algorithms on 12 UCI data sets.
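The first two stages described in the abstract can be illustrated with a small pure-Python sketch. It is not the authors' implementation: the abstract does not state AB-SMOTE's adaptive rule, so the sketch substitutes the plain Borderline-SMOTE "danger" criterion (at least half, but not all, of a minority point's k nearest neighbours belong to the majority class). Removing both points of each Tomek link is also an assumption, and all function names are hypothetical.

```python
import math
import random

def knn(idx, X, k):
    """Indices of the k nearest neighbours of X[idx] (Euclidean), excluding idx."""
    others = [j for j in range(len(X)) if j != idx]
    others.sort(key=lambda j: math.dist(X[idx], X[j]))
    return others[:k]

def borderline_minority(X, y, minority, k=5):
    """Minority points whose k neighbours are at least half, but not all, majority."""
    danger = []
    for i, label in enumerate(y):
        if label != minority:
            continue
        n_major = sum(1 for j in knn(i, X, k) if y[j] != minority)
        if k / 2 <= n_major < k:
            danger.append(i)
    return danger

def oversample_borderline(X, y, minority, n_new, k=5, seed=0):
    """Stage 1 stand-in: add n_new synthetic minority points near the border."""
    rng = random.Random(seed)
    danger = borderline_minority(X, y, minority, k)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    X_new, y_new = list(X), list(y)
    if not danger or len(min_idx) < 2:
        return X_new, y_new  # nothing to interpolate from
    for _ in range(n_new):
        i = rng.choice(danger)
        # k nearest minority neighbours of the chosen borderline point
        cands = sorted((j for j in min_idx if j != i),
                       key=lambda j: math.dist(X[i], X[j]))[:k]
        j = rng.choice(cands)
        gap = rng.random()  # interpolation factor in [0, 1)
        X_new.append(tuple(a + gap * (b - a) for a, b in zip(X[i], X[j])))
        y_new.append(minority)
    return X_new, y_new

def tomek_links(X, y):
    """Pairs (i, j), i < j, of opposite-class points that are mutual nearest neighbours."""
    nn = [knn(i, X, 1)[0] for i in range(len(X))]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and y[i] != y[j]]

def remove_tomek_links(X, y):
    """Stage 2 stand-in: drop both points of every Tomek link (an assumption)."""
    drop = {i for pair in tomek_links(X, y) for i in pair}
    keep = [i for i in range(len(X)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]
```

The cleaned output of `remove_tomek_links` would then be passed to any standard AdaBoost implementation for the third stage. Points are represented as tuples of floats so the sketch needs nothing beyond the standard library (`math.dist` requires Python 3.8 or later).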

Key words

machine learning/class imbalance learning/ensemble learning/SMOTE/data cleaning techniques

Classification

Information Technology and Security Science

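Cite this article

Qin Mengmei, Qiu Jianlin, Lu Pengcheng, Chen Lulu, Zhao Weikang. AdaBoost-based class imbalance learning algorithm[J]. Application Research of Computers, 2017, 34(11): 3229-3232, 3254, 5.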

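Funding

National Natural Science Foundation of China (NSF61202006/61272424)

Open Project of the State Key Laboratory for Novel Software Technology (KFKT2012B29)

Natural Science Foundation of Jiangsu Province (BK2010277)

Jiangsu Province Science and Technology Innovation Fund (BC2013167)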

Application Research of Computers

OA / PKU Core / CSCD / CSTPCD

ISSN 1001-3695
