Statistics & Decision (统计与决策), 2024, Vol. 40, Issue 8: 11-16, 6. DOI: 10.13546/j.cnki.tjyjc.2024.08.002
Oversampling Method for Imbalanced Data Based on Density Peak Clustering
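张智驹 (Zhang Zhiju) 1
Author information
- 1. School of Intelligent Information Engineering, Chongqing Aerospace Polytechnic (重庆航天职业技术学院), Chongqing 400021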
Abstract
Most oversampling methods for imbalanced data rely on too many parameters, tend to generate noise, and have difficulty handling manifold data sets. This paper therefore proposes an oversampling method for imbalanced data based on density peak clustering (OVMEDPC). First, OVMEDPC uses density peak clustering (DPC) to reveal the spatial structure of the imbalanced data. Second, OVMEDPC applies a DPC-based noise filter to remove noise. Finally, OVMEDPC uses an interpolation method to create synthetic minority-class samples. Experiments show that, with a random forest classifier, OVMEDPC outperforms 5 advanced oversampling methods on F-measure and G-mean.
Key words
imbalanced classification/oversampling method/classification/density peak/clustering
Classification
Information technology and security science
Cite this article
张智驹 (Zhang Zhiju). Oversampling Method for Imbalanced Data Based on Density Peak Clustering [J]. Statistics & Decision, 2024, 40(8): 11-16, 6.
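
Illustrative sketch (not from the paper): the abstract names three stages, DPC, a density-based noise filter, and interpolation, but does not give their details. The Python below is only a rough illustration of the generic building blocks under stated assumptions: the standard DPC quantities (local density rho with a Gaussian kernel, and delta, the distance to the nearest denser point), a hypothetical low-density threshold standing in for the noise filter, and plain SMOTE-style linear interpolation. The function names, the cutoff `dc`, and the `0.5 * median` threshold are assumptions made for this example and should not be read as OVMEDPC itself.

```python
import numpy as np

def dpc_rho_delta(X, dc):
    """Return the DPC local density rho and separation delta for each row of X."""
    # Pairwise Euclidean distances, shape (n, n).
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Gaussian-kernel local density; subtract 1 to drop each point's self term.
    rho = np.exp(-(d / dc) ** 2).sum(axis=1) - 1.0
    delta = np.empty(len(X))
    for i in range(len(X)):
        higher = rho > rho[i]
        # Distance to the nearest denser point; the densest point
        # conventionally gets its maximum distance to any point.
        delta[i] = d[i, higher].min() if higher.any() else d[i].max()
    return rho, delta

def interpolate(a, b, rng):
    """SMOTE-style synthetic sample drawn on the segment between a and b."""
    return a + rng.random() * (b - a)

rng = np.random.default_rng(0)
# Toy minority class: three nearby points and one isolated point.
minority = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]])
rho, delta = dpc_rho_delta(minority, dc=0.5)

# Hypothetical noise filter (an assumption, not the paper's rule):
# drop points whose density is far below the median density.
keep = minority[rho > 0.5 * np.median(rho)]

# One synthetic minority sample between two retained points.
synthetic = interpolate(keep[0], keep[1], rng)
```

In this toy run the isolated point has near-zero density and is filtered out, so the synthetic sample is generated only inside the dense group. In DPC, points with both large rho and large delta are the usual candidates for cluster centers.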