Journal of Central China Normal University (Natural Sciences), 2023, Vol. 57, Issue 6: 777-785, 9. DOI: 10.19603/j.cnki.1000-1190.2023.06.002
MICCFS: a correlation-based feature selection algorithm based on the maximum information coefficient
Abstract
The correlation-based feature selection algorithm (CFS) captures only linear relationships between variables in regression tasks and relies on symmetrical uncertainty in classification tasks. To address this limitation, we present MICCFS, a CFS variant based on the maximum information coefficient (MIC). MICCFS replaces both the linear correlation coefficient between variables and the symmetrical uncertainty used in classification with the MIC measure, and it searches for the feature subset with the best-first search algorithm. We compare MICCFS with CFS and other commonly used feature selection methods (SVM-RFE, Lasso, MIM, ReliefF, and Chi-Square) on eleven real-world regression datasets and ten classification datasets from the UCI Machine Learning Repository, using support vector machine (SVM), k-nearest neighbor (k-NN), naive Bayes (NB), and decision tree (DT) models. The results show that MICCFS outperforms the other methods.
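The subset evaluation and search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the standard CFS merit, k·r_cf / sqrt(k + k(k-1)·r_ff), and takes the association scores as precomputed inputs in [0, 1]; in MICCFS these would be MIC values, computed separately (for example with a MIC library such as minepy). The names `cfs_merit`, `best_first_search`, `max_stale`, and the toy scores are hypothetical and chosen only for illustration. The stopping rule (five consecutive non-improving expansions) is also an assumption.

```python
import heapq
from math import sqrt


def cfs_merit(subset, feat_target, feat_feat):
    """Standard CFS merit of a feature subset from precomputed association scores."""
    k = len(subset)
    if k == 0:
        return 0.0
    # Mean feature-target association over the subset.
    r_cf = sum(feat_target[i] for i in subset) / k
    # Mean pairwise feature-feature association (0 for a single feature).
    pairs = [(i, j) for i in subset for j in subset if i < j]
    r_ff = sum(feat_feat[i][j] for i, j in pairs) / len(pairs) if pairs else 0.0
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def best_first_search(feat_target, feat_feat, max_stale=5):
    """Forward best-first search; stops after max_stale non-improving expansions."""
    n = len(feat_target)
    start = frozenset()
    best, best_merit = start, float("-inf")
    # Heap entries: (negated merit, unique tie-breaker, subset).
    frontier = [(0.0, 0, start)]
    visited = {start}
    counter, stale = 1, 0
    while frontier and stale < max_stale:
        neg_merit, _, subset = heapq.heappop(frontier)
        if -neg_merit > best_merit:
            best, best_merit, stale = subset, -neg_merit, 0
        else:
            stale += 1
        # Expand the current node by adding one feature at a time.
        for f in range(n):
            child = subset | {f}
            if child in visited:
                continue
            visited.add(child)
            merit = cfs_merit(child, feat_target, feat_feat)
            heapq.heappush(frontier, (-merit, counter, child))
            counter += 1
    return sorted(best), best_merit


# Toy scores (hypothetical): features 0 and 1 are relevant but redundant with
# each other, feature 2 is irrelevant, feature 3 is relevant and non-redundant.
feat_target = [0.9, 0.85, 0.1, 0.7]
feat_feat = [
    [1.00, 0.95, 0.05, 0.10],
    [0.95, 1.00, 0.05, 0.10],
    [0.05, 0.05, 1.00, 0.05],
    [0.10, 0.10, 0.05, 1.00],
]
selected, merit = best_first_search(feat_target, feat_feat)
print(selected, round(merit, 4))
```

On these toy scores the search keeps one of the two redundant features together with the non-redundant one, which is the behavior the merit function is designed to reward: high association with the target and low association among the selected features.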
Key words
correlation-based feature selection / maximum information coefficient / feature selection / classification / dimension reduction
Classification
Mathematics
Cite this article
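Luo Youxi (罗幼喜), Xie Kunming (谢昆明), Hu Chaozhu (胡超竹), Li Hanfang (李翰芳). MICCFS: a correlation-based feature selection algorithm based on the maximum information coefficient [J]. Journal of Central China Normal University (Natural Sciences), 2023, 57(6): 777-785, 9.
Funding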
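National Natural Science Foundation of China, Young Scientists Fund (11701161)
Humanities and Social Sciences Fund of the Ministry of Education (17YJA790098)
Key Humanities and Social Sciences Project of the Hubei Provincial Department of Education (20D043)
Doctoral Start-up Fund of Hubei University of Technology (BSQD2020103)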