
Research on Classification for Imbalanced Datasets Based on Improved SMOTE

王超学 潘正茂 董丽丽 马春森 张星

Computer Engineering and Applications, 2013, Vol. 49, Issue 2: 184-187, 245 (5 pages). DOI: 10.3778/j.issn.1002-8331.1109-0145


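王超学¹, 潘正茂¹, 董丽丽¹, 马春森², 张星¹

Author information

  • 1. School of Information and Control Engineering, Xi'an University of Architecture and Technology, Xi'an 710055
  • 2. Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193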

Abstract

Based on an analysis of the shortcomings of SMOTE (Synthetic Minority Over-sampling Technique), an improved SMOTE (SSMOTE) is presented. The key idea of SSMOTE is to introduce the concepts of support and roulette wheel selection into SMOTE and to make full use of the distribution of heterogeneous (different-class) nearest neighbors, so that both the quality and the quantity of the synthesized minority class samples can be finely controlled. SSMOTE is combined with KNN (K-Nearest Neighbor) to handle classification on imbalanced datasets, and extensive experiments on UCI datasets compare SSMOTE with algorithms from the related literature. The simulation results show that SSMOTE synthesizes minority class samples effectively and, together with KNN, yields better classification performance on imbalanced datasets.
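The idea of pairing a per-sample support value with roulette wheel selection inside SMOTE can be illustrated with a minimal sketch. The abstract does not define support, so the definition used here (the share of minority points among a sample's k nearest neighbors) is an assumption made purely for illustration, as are all function names and parameters; this is not the authors' SSMOTE implementation.

```python
import math
import random


def support(i, minority, majority, k):
    """Hypothetical support of minority[i]: the share of minority points
    among its k nearest neighbours (an assumed definition, not the paper's)."""
    sample = minority[i]
    labelled = [(p, 1) for j, p in enumerate(minority) if j != i]
    labelled += [(p, 0) for p in majority]
    nearest = sorted(labelled, key=lambda t: math.dist(sample, t[0]))[:k]
    return sum(label for _, label in nearest) / len(nearest)


def roulette_pick(items, weights, rng):
    """Roulette wheel selection: pick an item with probability
    proportional to its weight; zero-weight items are never picked."""
    total = sum(weights)
    if total <= 0:
        return rng.choice(items)  # no usable weights: fall back to uniform
    spin = rng.random() * total
    cumulative = 0.0
    chosen = items[0]
    for item, weight in zip(items, weights):
        if weight <= 0:
            continue
        cumulative += weight
        chosen = item
        if spin < cumulative:
            break
    return chosen


def synthesize(minority, majority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points by SMOTE-style interpolation,
    choosing each base sample by roulette wheel over the support values."""
    rng = random.Random(seed)
    indices = list(range(len(minority)))
    weights = [support(i, minority, majority, k) for i in indices]
    synthetic = []
    for _ in range(n_new):
        b = roulette_pick(indices, weights, rng)
        base = minority[b]
        others = [p for j, p in enumerate(minority) if j != b]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(x + gap * (y - x) for x, y in zip(base, mate)))
    return synthetic


if __name__ == "__main__":
    minority = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0)]
    majority = [(5.0, 6.0), (6.0, 5.0), (5.0, 4.0), (6.0, 6.0)]
    for point in synthesize(minority, majority, n_new=5, k=2):
        print(point)
```

Under this assumed weighting, a minority sample surrounded only by majority points gets support 0 and is never used as a base, which is one plausible way to limit the number of synthetic samples generated near likely noise. Other weightings could fit the abstract's description equally well.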

Key words

imbalanced datasets / classification / support / roulette wheel selection / Synthetic Minority Over-sampling Technique (SMOTE)

Classification

Information technology and security science

Cite this article

王超学, 潘正茂, 董丽丽, 马春森, 张星. Research on classification for imbalanced dataset based on improved SMOTE [J]. Computer Engineering and Applications, 2013, 49(2): 184-187, 245.

Funding

National Natural Science Foundation of China (No. 31170393)

Natural Science Project of the Shaanxi Provincial Department of Education (No. 2010JK620)

Computer Engineering and Applications

OA / CSCD / CSTPCD

ISSN 1002-8331
