
Survey of Research on SMOTE Type Algorithms

Wang Xiaoxia (王晓霞), Li Leixiao (李雷孝), Lin Hao (林浩)

Journal of Frontiers of Computer Science and Technology (计算机科学与探索), 2024, Vol. 18, Issue 5: 1135-1159, 25. DOI: 10.3778/j.issn.1673-9418.2309079


Wang Xiaoxia (1), Li Leixiao (2), Lin Hao (3)

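Author information

  • 1. College of Data Science and Application, Inner Mongolia University of Technology, Hohhot 010080
  • 2. College of Data Science and Application, Inner Mongolia University of Technology, Hohhot 010080; Inner Mongolia Autonomous Region Engineering Technology Research Center of Big Data-Based Software Service, Hohhot 010080
  • 3. School of Computer Science and Engineering, Tianjin University of Technology, Tianjin 300384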

Abstract

The synthetic minority oversampling technique (SMOTE) has become one of the mainstream methods for handling unbalanced data because it deals effectively with minority samples, and many improved SMOTE algorithms have been proposed. However, very little existing research considers the popular algorithmic-level improvement methods. This paper therefore provides a more comprehensive analysis of existing SMOTE-type algorithms. First, the basic principles of the SMOTE method are elaborated in detail. The SMOTE-type algorithms are then analyzed systematically at two levels, the data level and the algorithmic level, and new ideas for hybrid improvements that combine both levels are introduced. Data-level improvement balances the data distribution by deleting or adding data through different operations during preprocessing; algorithmic-level improvement does not change the data distribution and mainly strengthens the focus on minority samples by modifying or creating algorithms. A comparison of the two kinds of methods shows that data-level methods are less restricted in their application, while algorithmic-level improvements generally have higher algorithmic robustness. To provide more comprehensive basic research material on SMOTE-type algorithms, the paper finally lists commonly used datasets and evaluation metrics, and gives ideas for future research to better cope with the unbalanced data problem.
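The abstract does not spell out the basic SMOTE step, so the following is only a minimal sketch of the commonly described idea: a synthetic minority point is placed at a random position on the line segment between a minority sample and one of its nearest minority neighbours. The function name `smote_sketch`, its parameters, and the toy data are assumptions made for illustration; this is not code from the survey.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length numeric tuples (minority-class samples only).
    k: number of nearest minority neighbours considered for each base sample.
    All names here are hypothetical and chosen for this illustration.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base sample.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Toy minority class with four 2-D samples (made-up values).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_new=6)
print(len(new_points))
```

Because every synthetic point is a convex combination of two existing minority samples, it always lies inside the bounding box of the minority class. This is a data-level operation in the abstract's sense: it adds data during preprocessing and leaves the downstream classifier unchanged.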

Keywords

unbalanced data / synthetic minority oversampling technique (SMOTE) / oversampling / supervised learning

Classification

Information Technology and Security Science

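Cite this article

Wang Xiaoxia, Li Leixiao, Lin Hao. Survey of Research on SMOTE Type Algorithms [J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(5): 1135-1159, 25.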

Funding

This work was supported by the National Natural Science Foundation of China (62362055), the Key Research and Development and Achievement Transformation Program of Inner Mongolia Autonomous Region (2022YFSJ0013, 2023YFHH0052), the Support Program for Young Scientific and Technological Talents in Higher Education Institutions in Inner Mongolia Autonomous Region (NJYT22084), the Natural Science Foundation of Inner Mongolia (2023MS06008), and the Special Funds for Transformation of Scientific and Technological Achievements of Inner Mongolia Autonomous Region (2020CG0073, 2021CG0033).

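Journal of Frontiers of Computer Science and Technology (计算机科学与探索)

Open access (OA); listed in the Peking University core journals list (北大核心) and CSTPCD

ISSN 1673-9418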
