Computer Engineering, 2025, Vol. 51, Issue (5): 196-205, 10. DOI: 10.19678/j.issn.1000-3428.0068963
受同轴对称抛物线约束的少数类样本合成方法
Synthesis Method for Minority Samples with Constraints of Coaxial Symmetric Parabolas
Abstract
Oversampling techniques often use linear interpolation to synthesize samples, but this approach has two drawbacks: the sampling results lack randomness, and the degree of class overlap between samples of different categories tends to increase, which makes it difficult to improve classification performance on imbalanced sample sets. This paper proposes a method for generating minority-class samples under coaxial symmetric parabolic constraints. First, an adaptive weighting strategy based on a risk factor and a similarity factor is established for minority-class samples; the weights determine the direction and range of sample synthesis during sampling. Then, a pair of coaxial symmetric parabolas is constructed from the minority-class samples and their corresponding weights, and the closed region enclosed by the two parabolas is taken as the nonlinear synthesis region. Finally, when a new sample is introduced, the change in the Bhattacharyya coefficient within the neighborhood of the new sample is observed to determine whether the sampling step avoids intruding into the distribution areas of other classes, thereby improving sampling quality. Comparison experiments on six public sample sets from UCI show that, with C4.5 as the classifier, the integrated oversampling method improves precision by 7.85 percentage points, recall by 2.87 percentage points, and G-means by 2.00 percentage points compared with the original sampling method.
Keywords
linear interpolation/parabolas/adaptive weighting/nonlinear synthesis region/Bhattacharyya coefficient
Classification
Information Technology and Security Science
Cite this article
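ZHU Chenmin (朱宸敏), YU Su (余粟). Synthesis Method for Minority Samples with Constraints of Coaxial Symmetric Parabolas[J]. Computer Engineering, 2025, 51(5): 196-205, 10.
Funding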
National Key Technology R&D Program of China (2015BAF10B00)
Scientific Research Program of the Science and Technology Commission of Shanghai Municipality (17511110204)
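
The abstract names three standard quantities without defining them: linear interpolation between two samples, the Bhattacharyya coefficient, and G-means. The Python sketch below gives their textbook definitions only, as a reading aid. It is not the authors' implementation: the function names are hypothetical, the parabola construction and the adaptive weights are not reproduced because the abstract does not specify them, and the sketch assumes the Bhattacharyya coefficient is computed on discrete distributions such as class-frequency histograms.

```python
import math


def linear_interpolate(x_i, x_j, lam):
    """Point on the segment from x_i to x_j; lam in [0, 1] (SMOTE-style synthesis)."""
    return [a + lam * (b - a) for a, b in zip(x_i, x_j)]


def bhattacharyya_coefficient(p, q):
    """BC(p, q) = sum_k sqrt(p_k * q_k) for two discrete distributions.

    Equals 1 for identical distributions and 0 for non-overlapping ones.
    """
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def g_mean(tp, fn, tn, fp):
    """Geometric mean of the recall on the positive and negative classes."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


# Hypothetical usage with made-up numbers, for illustration only.
midpoint = linear_interpolate([0.0, 0.0], [2.0, 4.0], 0.5)
same = bhattacharyya_coefficient([0.5, 0.5], [0.5, 0.5])
disjoint = bhattacharyya_coefficient([1.0, 0.0], [0.0, 1.0])
score = g_mean(tp=8, fn=2, tn=9, fp=1)
```

Every point produced by `linear_interpolate` lies on the straight segment between the two samples, which is the lack of randomness that the abstract's closed nonlinear synthesis region is meant to address. A Bhattacharyya coefficient close to 1 means two distributions overlap heavily, so tracking how it changes after a candidate sample is added is one plausible way to detect intrusion into another class's region.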