Computer Applications and Software (计算机应用与软件), 2012, Vol. 29, Issue 3: 91-93, 162, 4.
基于主动学习SMOTE的非均衡数据分类
ACTIVE LEARNING SMOTE BASED IMBALANCED DATA CLASSIFICATION
Abstract
Synthetic Minority Over-sampling Technique (SMOTE) is a typical over-sampling preprocessing method that can effectively balance imbalanced data. However, it also introduces noise and other problems, which degrade classification accuracy. To address this, an imbalanced data classification approach called ALSMOTE is proposed; it is based on active learning SMOTE and draws on the classification performance of active learning SVM. Because active learning SVM uses a distance-based strategy for actively selecting the best samples to learn from, it can pick the valuable majority class samples out of the imbalanced data and discard the valueless ones, which improves operational efficiency and mitigates the problems introduced by SMOTE. First, SMOTE is used to balance a small part of the samples and obtain an initial classification; then the active learning strategy is followed to adjust the classification accuracy. Experimental results show that the proposed method can effectively improve classification accuracy on imbalanced data.
Keywords
Active learning/Imbalanced data set/Synthetic Minority Over-sampling Technique (SMOTE)/Support vector machine (SVM)
Classification
Information Technology and Security Science
Cite this article
张永, 李卓然, 刘小丹. 基于主动学习SMOTE的非均衡数据分类[J]. 计算机应用与软件, 2012, 29(3): 91-93, 162, 4.
Funding
National Natural Science Foundation of China (10771092)
Doctoral Start-up Fund of the Liaoning Provincial Department of Science and Technology (20081079)
Dalian Science and Technology Fund (2010J21DW019)
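
The two stages named in the abstract (SMOTE over-sampling, then distance-based selection of majority class samples) can be illustrated with a minimal sketch in plain Python. This is not the authors' ALSMOTE implementation: the function names, the parameters `k`, `seed`, `w`, `b`, and the toy data are all hypothetical, and "pick the majority samples closest to the current SVM hyperplane" is only one common reading of a distance-based selection strategy, assumed here because the abstract does not spell the rule out. The hyperplane is supplied by hand rather than trained.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours.
    Needs at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def closest_to_hyperplane(samples, w, b, n_select):
    """Return the n_select samples nearest the hyperplane w.x + b = 0.
    Assumption: nearness to the boundary stands in for 'valuable'."""
    norm = math.sqrt(sum(wi * wi for wi in w))

    def distance(x):
        return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

    return sorted(samples, key=distance)[:n_select]


# Toy data (hypothetical): 4 minority points, 5 majority points.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
majority = [(3.0, 3.0), (2.0, 1.8), (4.0, 4.5), (1.9, 2.1), (5.0, 5.0)]

# Stage 1: over-sample the minority class.
synthetic = smote(minority, n_new=4)

# Stage 2: keep only the majority samples nearest a hand-picked boundary
# x + y - 3 = 0; a real pipeline would take w and b from a trained SVM.
informative = closest_to_hyperplane(majority, w=(1.0, 1.0), b=-3.0, n_select=2)
```

In a full active learning loop, the classifier would be retrained after each batch of selected samples and the selection repeated; the sketch shows a single selection step only.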