Information and Control, 2024, Vol. 53, Issue 2: 182-190, 9. DOI: 10.13976/j.cnki.xk.2024.3110
Oversampling Algorithms for Gaussian-type Data Based on Generative Adversarial Networks
Chen Gang 1, Hou Binjie 1
Author information
- 1. School of Science, Dalian Maritime University, Dalian, Liaoning 116026
Abstract
Classifiers trained on imbalanced data tend to favor the majority class, which reduces classification effectiveness. To address this problem, we propose a Monte Carlo oversampling algorithm based on generative adversarial networks (GANs). First, we use GANs to model the probability density function of the minority class data, and we use the resulting probability density values to determine the oversampling weights of the minority class data. Second, to ensure the diversity of the generated data, we use a Monte Carlo algorithm to oversample the minority class data. At the same time, to avoid crossover and overlap with the majority class, we introduce the 3σ rule to flip the data of the minority class into the 3σ interval of the majority class, which balances the dataset. Finally, we select seven datasets from the UCI and KEEL databases for the experiments and use a decision tree as the base classifier. The experimental results show that the proposed algorithm is more effective than the comparison algorithms.
Keywords
imbalanced data / generative adversarial network (GAN) / Monte Carlo algorithm / probability density function / oversampling algorithm
Classification
Information Technology and Security Science
Cite this article
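Chen Gang, Hou Binjie. Oversampling Algorithms for Gaussian-type Data Based on Generative Adversarial Networks[J]. Information and Control, 2024, 53(2): 182-190, 9.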
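Illustrative sketch

The pipeline summarized in the abstract (density estimate, density-based weights, Monte Carlo oversampling, 3σ check against the majority class) can be sketched roughly as follows. This is not the authors' implementation. A fitted Gaussian density stands in for the GAN-based density estimate; the weights are assumed to be proportional to density; new points are assumed to be weighted draws of existing minority samples plus small Gaussian jitter. All function names and the `noise_scale` parameter are hypothetical. The abstract does not say exactly how flagged points are "flipped", so the sketch only identifies points that fall inside the majority class's per-feature 3σ interval.

```python
import numpy as np

def gaussian_density(X, mean, cov):
    """Multivariate normal density; a stand-in for the GAN-based estimate."""
    d = X.shape[1]
    diff = X - mean
    inv = np.linalg.inv(cov)
    quad = np.einsum("ij,jk,ik->i", diff, inv, diff)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * quad) / norm

def oversample_weights(density):
    """Assumed weighting: proportional to density, normalised to sum to 1."""
    return density / density.sum()

def monte_carlo_oversample(X_min, weights, n_new, noise_scale, rng):
    """Draw seed samples by weight, then jitter them (assumed scheme)."""
    idx = rng.choice(len(X_min), size=n_new, p=weights)
    noise = rng.normal(0.0, noise_scale, size=(n_new, X_min.shape[1]))
    return X_min[idx] + noise * X_min.std(axis=0)

def within_three_sigma(points, X_maj):
    """True where a point lies inside the majority class's 3-sigma box."""
    mu = X_maj.mean(axis=0)
    sigma = X_maj.std(axis=0)
    return np.all(np.abs(points - mu) <= 3 * sigma, axis=1)

# Toy data: 200 majority points and 20 minority points in two dimensions.
rng = np.random.default_rng(0)
X_maj = rng.normal(0.0, 1.0, size=(200, 2))
X_min = rng.normal(4.0, 0.5, size=(20, 2))

density = gaussian_density(X_min, X_min.mean(axis=0), np.cov(X_min, rowvar=False))
weights = oversample_weights(density)
X_new = monte_carlo_oversample(X_min, weights, n_new=180, noise_scale=0.1, rng=rng)

# Synthetic points that would overlap the majority class under the 3-sigma rule.
overlap = within_three_sigma(X_new, X_maj)
```

Here `overlap` marks the synthetic minority points that would need further handling under the 3σ rule; how the paper treats them is not specified in the abstract.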