Journal of Henan Polytechnic University (Natural Science), 2017, Vol. 36, Issue 5: 95-99, 105, 6. DOI: 10.16186/j.cnki.1673-9787.2017.05.016
A semi-supervised feature selection algorithm for imbalanced data (一种面向不平衡数据的半监督特征选择算法)
Abstract
Given the scarcity of labeled samples and the high feature dimensionality of imbalanced data, a new semi-supervised feature selection algorithm based on a genetic algorithm (GA) and Biased-SVM is proposed. First, a Biased-SVM model, which can handle imbalanced sample data, is trained on the initial labeled sample set. The trained Biased-SVM model is then used to label the unlabeled samples, and the newly labeled samples are added to the initial labeled sample set. Finally, the optimal feature subset is selected by the GA-based feature selection method for imbalanced data. Experimental results show that the proposed method not only reduces the feature dimension but also generally improves the precision of the minority class under different labeled-sample rates.
Keywords
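genetic algorithm/Biased-SVM/imbalanced data/semi-supervised learning/feature selection
Classification
Information technology and security science
Cite this article
杜利敏 (Du Limin), 徐扬 (Xu Yang). A semi-supervised feature selection algorithm for imbalanced data [J]. Journal of Henan Polytechnic University (Natural Science), 2017, 36(5): 95-99, 105, 6.
Funding
Young Scientists Fund of the National Natural Science Foundation of China (61305074)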
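The three stages described in the abstract (train on the labeled set, label the unlabeled samples and merge them, then run a GA over feature subsets) can be illustrated with a minimal Python sketch. The abstract does not give the fitness function, the GA operators, or any parameter values, so everything below is an assumption made for illustration: the function names `pseudo_label`, `minority_precision`, and `ga_select` are hypothetical, the GA uses binary tournament selection, one-point crossover, bit-flip mutation, and elitism, and the classifier is any object with `fit` and `predict` methods rather than the authors' Biased-SVM.

```python
import random


def pseudo_label(model, X_lab, y_lab, X_unlab):
    """Stages 1-2 (assumed form): train on the labeled set, label the
    unlabeled samples, and return the merged training set."""
    model.fit(X_lab, y_lab)
    y_new = list(model.predict(X_unlab))
    return list(X_lab) + list(X_unlab), list(y_lab) + y_new


def minority_precision(y_true, y_pred, minority=1):
    """Precision on the minority class: TP / (TP + FP); 0.0 if never predicted."""
    hits = [t for t, p in zip(y_true, y_pred) if p == minority]
    return sum(t == minority for t in hits) / len(hits) if hits else 0.0


def _tournament(pop, fitness, rng):
    """Binary tournament: draw two individuals and keep the fitter one."""
    a, b = rng.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b


def ga_select(n_features, fitness, pop_size=20, generations=30,
              p_mut=0.05, seed=0):
    """Stage 3 (assumed operators): search 0/1 feature masks that maximise
    `fitness(mask)`. Requires n_features >= 2 for one-point crossover."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: carry the current best forward
        while len(children) < pop_size:
            p1 = _tournament(pop, fitness, rng)
            p2 = _tournament(pop, fitness, rng)
            cut = rng.randrange(1, n_features)      # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best
```

In a fuller experiment, `fitness(mask)` would presumably train the classifier on the columns selected by `mask` and return something like `minority_precision` on held-out data, possibly with a penalty on the number of selected features. One possible stand-in for Biased-SVM is a class-weighted SVM such as scikit-learn's `SVC` with its `class_weight` parameter, which scales the penalty `C` per class; whether that matches the authors' formulation is not stated in the abstract.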