
Hybrid imbalanced data processing based on ADASYN and WGAN (基于ADASYN和WGAN的混合不平衡数据处理方法)

ZHOU Wanzhen, SHENG Yuanyuan, ZHANG Yongqiang, MA Jinlong

Hebei Journal of Industrial Science and Technology (河北工业科技), 2024, Vol. 41, Issue 4: 291-298. DOI: 10.7535/hbgykj.2024yx04007

Hybrid imbalanced data processing based on ADASYN and WGAN

ZHOU Wanzhen¹, SHENG Yuanyuan², ZHANG Yongqiang¹, MA Jinlong¹

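Author information

  • 1. School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang, Hebei 050018, China; Hebei Intelligent Internet of Things Technology Innovation Center, Shijiazhuang, Hebei 050018, China
  • 2. School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang, Hebei 050018, China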

Abstract

To address the low classification accuracy of minority class samples in imbalanced datasets, an ADASYN-WGAN method was proposed for processing imbalanced data. First, minority class samples were generated with the ADASYN algorithm, and these generated samples replaced the random noise input of the WGAN. Second, the WGAN was used to generate minority class samples that conform to the distribution of the original dataset, yielding a balanced dataset. Third, on six public datasets, a random forest classifier was used to compare the datasets produced by the proposed method and by four oversampling algorithms against the original dataset. Finally, the effectiveness of the proposed method was verified with classification metrics including F1-Score, G-mean, and AUC. In the comparison experiments, under ten-fold cross-validation with the random forest classifier, the balanced dataset obtained by ADASYN-WGAN achieved the best values of all classification metrics on four of the public datasets; on the other two, its F1-Score and G-mean were the highest, although its AUC was slightly lower. The proposed ADASYN-WGAN method can generate high-quality samples and provides a reference for addressing prediction bias against minority class samples in imbalanced datasets.
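As a rough illustration of two of the metrics named in the abstract, the sketch below computes F1-Score and G-mean from binary predictions. It assumes the standard definitions (F1 as the harmonic mean of precision and recall, G-mean as the geometric mean of sensitivity and specificity), which the abstract does not spell out. The labels and the helper names `confusion_counts`, `f1_score`, and `g_mean` are hypothetical and are not taken from the paper.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN, treating `positive` as the minority class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def g_mean(tp, fp, tn, fn):
    """Geometric mean of sensitivity (minority recall) and specificity."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Hypothetical labels: 1 = minority class, 0 = majority class.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print("F1-Score:", f1_score(tp, fp, fn))
print("G-mean:", g_mean(tp, fp, tn, fn))
```

Both metrics depend on how well the minority class is recognized, which is why they are commonly reported for imbalanced data alongside AUC: a classifier that labels every sample as the majority class scores zero on each.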

Key words

data processing; imbalanced data; WGAN; ADASYN; oversampling method; random forest

Classification

Information technology and security science

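Cite this article

ZHOU Wanzhen, SHENG Yuanyuan, ZHANG Yongqiang, MA Jinlong. Hybrid imbalanced data processing based on ADASYN and WGAN[J]. Hebei Journal of Industrial Science and Technology, 2024, 41(4): 291-298.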

Funding

Natural Science Foundation of Hebei Province (F2022208002)

Key Project of Science and Technology Research of Higher Education Institutions of Hebei Province (ZD2021048)

Hebei Journal of Industrial Science and Technology (河北工业科技)

OA; CSTPCD

ISSN 1008-1534
