Imbalanced Data Processing Algorithm Based on Mixed Sampling (基于混合采样的不平衡数据集算法研究)

张明 胡晓辉 吴嘉昕

Computer Engineering and Applications (计算机工程与应用), 2019, Vol. 55, Issue 17: 68-75, 8. DOI: 10.3778/j.issn.1002-8331.1804-0307

Imbalanced Data Processing Algorithm Based on Mixed Sampling

张明¹, 胡晓辉¹, 吴嘉昕¹

Author information

  • 1. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China

Abstract

To address the poor classification performance on imbalanced datasets, a novel classification algorithm based on mixed sampling (BSI) is proposed. The method first introduces the coefficient of variation to identify samples in the sparse domain and the dense domain, and then handles the two groups differently. For minority samples in the sparse domain, an oversampling method (BSMOTE) that improves on the SMOTE algorithm is proposed; for majority samples in the dense domain, an improved undersampling method (IS) is proposed. Experiments on six imbalanced datasets show that the algorithm achieves higher G-mean, F-value, and AUC, and effectively improves the overall classification performance on imbalanced datasets.
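
The building blocks named in the abstract can be illustrated with a minimal sketch. It is not the authors' BSI, BSMOTE, or IS procedure, whose details are not given on this page. It only shows the textbook definitions of the coefficient of variation, the classic SMOTE interpolation step, and the G-mean and F-value metrics. All function names are hypothetical, and treating the minority class as the positive class is an assumption.

```python
import math
import random


def coefficient_of_variation(values):
    """Return std / mean (population std) for a list of numbers with nonzero mean."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(variance) / mean


def smote_point(sample, neighbor, rng=random):
    """Classic SMOTE step: a random point on the segment from sample to neighbor."""
    gap = rng.random()  # uniform in [0, 1)
    return [s + gap * (n - s) for s, n in zip(sample, neighbor)]


def g_mean_and_f_value(tp, fn, fp, tn):
    """G-mean and F-value (F1), minority class assumed positive.

    Assumes every denominator below is nonzero.
    """
    recall = tp / (tp + fn)        # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    g_mean = math.sqrt(recall * specificity)
    f_value = 2 * precision * recall / (precision + recall)
    return g_mean, f_value


if __name__ == "__main__":
    # Illustrative numbers only, not results from the paper.
    print(coefficient_of_variation([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))
    print(smote_point([0.0, 0.0], [2.0, 4.0]))
    print(g_mean_and_f_value(tp=8, fn=2, fp=4, tn=36))
```

G-mean rewards a classifier only when both classes are recognized well, which is why it is commonly reported alongside F-value and AUC for imbalanced data.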

Keywords

imbalanced datasets / coefficient of variation / SMOTE algorithm / undersampling

Classification

Information Technology and Security Science

Cite this article

张明, 胡晓辉, 吴嘉昕. Imbalanced Data Processing Algorithm Based on Mixed Sampling [J]. Computer Engineering and Applications, 2019, 55(17): 68-75, 8.

Funding

National Natural Science Foundation of China (No. 61163009)

Gansu Provincial Science and Technology Program (No. 144NKCA040)

Computer Engineering and Applications (计算机工程与应用)

Indexing: OA, PKU Core (北大核心), CSCD, CSTPCD

ISSN 1002-8331
