Acta Electronica Sinica, 2009, Vol. 37, Issue (11): 2489-2495, 7.
A Classification Method for Imbalance Data Set Based on Kernel SMOTE
Abstract
An approach based on kernel SMOTE (Synthetic Minority Over-sampling Technique) is presented for classifying imbalanced data sets with a Support Vector Machine (SVM). The method first oversamples the minority class in feature space using the kernel SMOTE algorithm; the pre-images of the synthetic instances are then found from a distance relation between feature space and input space. Finally, these pre-images are appended to the original data set to train an SVM. Experiments on real data sets indicate that, compared with the SMOTE approach, the samples constructed by the kernel SMOTE algorithm are of higher quality. As a result, the effectiveness of SVM classification on imbalanced data sets is improved.
Keywords
imbalanced data set / support vector machine / input space / feature space / pre-image
Classification
Information Technology and Safety Science
Cite this article
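Zeng Zhiqiang, Wu Qun, Liao Beishui, Gao Ji. A Classification Method for Imbalance Data Set Based on Kernel SMOTE [J]. Acta Electronica Sinica, 2009, 37(11): 2489-2495, 7.
Funding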
National Natural Science Foundation of China (No. 60773177)
Fujian Province Young Talent Project (No. 2008F3108)
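Illustrative sketch
The feature-space oversampling step and the distance relation mentioned in the abstract can be illustrated roughly as follows. This is a minimal sketch and not the authors' implementation. It assumes an RBF kernel, and the function names (`rbf_kernel`, `synthetic_feature_sq_dists`, `rbf_input_sq_dists`) and the value of `gamma` are hypothetical choices made for illustration. A synthetic instance is taken as a point on the segment between the images of a minority sample and its nearest feature-space neighbour, and its distances to the other samples are computed from the kernel matrix alone. Converting those distances back to input space is exact for images of real points and only an approximation for synthetic ones.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    """RBF kernel matrix k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq = (np.sum(X ** 2, axis=1)[:, None]
          + np.sum(Y ** 2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))

def synthetic_feature_sq_dists(K, i, j, delta):
    """Squared feature-space distances from the synthetic point
    z = (1 - delta) * phi(x_i) + delta * phi(x_j) to every phi(x_m),
    computed from the kernel matrix K only."""
    a, b = 1.0 - delta, delta
    z_norm_sq = a * a * K[i, i] + 2.0 * a * b * K[i, j] + b * b * K[j, j]
    z_dot_phi = a * K[i] + b * K[j]
    return z_norm_sq - 2.0 * z_dot_phi + np.diag(K)

def rbf_input_sq_dists(feat_sq, gamma):
    """Map squared feature-space distances to squared input-space
    distances for the RBF kernel (approximate for synthetic points,
    which need not be the image of any input vector)."""
    inner = np.clip(1.0 - 0.5 * feat_sq, 1e-12, 1.0)
    return -np.log(inner) / gamma

# Toy minority class (made-up data, for illustration only).
rng = np.random.default_rng(0)
X_min = rng.normal(size=(6, 2))
gamma = 0.5
K = rbf_kernel(X_min, X_min, gamma)

# Nearest neighbour of sample i, measured in feature space.
feat_sq = np.diag(K)[:, None] - 2.0 * K + np.diag(K)[None, :]
np.fill_diagonal(feat_sq, np.inf)
i = 0
j = int(np.argmin(feat_sq[i]))

# Distances from a synthetic point halfway between phi(x_i) and phi(x_j).
d2_feat = synthetic_feature_sq_dists(K, i, j, delta=0.5)
d2_input = rbf_input_sq_dists(d2_feat, gamma)
```

The input-space distances in `d2_input` are the kind of constraint a pre-image step could use to place the synthetic sample relative to its neighbours. How the paper solves for the pre-image itself is not stated in the abstract and is not attempted here.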