Acta Electronica Sinica, 2018, Vol. 46, Issue 1: 135-144, 10. DOI: 10.3969/j.issn.0372-2112.2018.01.019
An Oversampling Method for Imbalance Data Based on Three-Way Decision Model
Abstract
Sampling is an effective way to address the classification of imbalanced data. According to the distribution of samples, we employ the three-way decision model to divide the universe into three parts: the positive region, the boundary region, and the negative region. We then oversample the minority class samples in the boundary region and in the negative region separately. On this basis, a novel oversampling algorithm for imbalanced data based on the three-way decision model, named TWDIDOS, is developed. Experimental results show that the proposed method effectively handles two-class classification of imbalanced data and outperforms other oversampling methods on Recall, F-value, and AUC with the C4.5, KNN, and CART classifiers.
Keywords
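three-way decision/neighborhood rough set/boundary sampling/imbalanced data/SMOTE
Classification
Information Technology and Security Science
Cite this article
Hu Feng, Wang Lei, Zhou Yao. An Oversampling Method for Imbalance Data Based on Three-Way Decision Model [J]. Acta Electronica Sinica, 2018, 46(1): 135-144, 10.
Funding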
National Natural Science Foundation of China (No.61309014, No.61379114, No.61472056)
Humanities and Social Sciences Planning Project of the Ministry of Education (No.15XJA630003)
Chongqing Basic and Frontier Research Program (No.cstc2013jcyjA40063, No.cstc2014jcyjA40049)
Scientific and Technological Research Program of the Chongqing Municipal Education Commission (No.KJ1500416)