
Hierarchical subspace ReliefF feature selection algorithm for high-dimensional small sample data

Cheng Fengwei (程凤伟) 1, Wang Wenjian (王文剑) 2, Zhang Zhenzhen (张珍珍) 1

Journal of Nanjing University (Natural Science) (南京大学学报(自然科学版)), 2023, Vol. 59, Issue 6: 928-936 (9 pages). DOI: 10.13232/j.cnki.jnju.2023.06.003

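Author information

  • 1. Department of Computer Science and Technology, Taiyuan University, Taiyuan 030032
  • 2. Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006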

Abstract

High-dimensional small sample data has far more feature dimensions than samples and usually contains a large number of redundant features. The ReliefF algorithm faces two challenges when dealing with such data. First, most existing improved ReliefF algorithms eliminate redundant features by calculating the mutual information between features, which is not applicable to high-dimensional data. Second, classifying with a fixed number of the features most relevant to the label may not be the optimal choice, because it does not consider the impact of different feature combinations on classification performance. In this paper, we propose a ReliefF feature selection algorithm based on hierarchical subspaces. It divides the original feature set into subspaces with a hierarchical structure and calculates the local dependencies of the lower subspaces using neighborhood rough set theory, which allows redundant features to be eliminated in batches with high efficiency on high-dimensional small sample data. In addition, to account for the influence of different feature combinations on the results, the concept of "local leadership" is introduced: features with stronger "leading" ability in some subspaces are retained, so that features are evaluated more objectively from both local and global perspectives. Experiments on six microarray gene datasets show that the proposed method is more efficient than existing methods and maintains good classification performance.
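The abstract names two building blocks without giving formulas: a Relief-style relevance weight and a neighborhood rough set dependency. The sketch below illustrates the textbook form of each on toy data. It is not the authors' algorithm: it uses basic Relief (one nearest hit and one nearest miss per sample) rather than full ReliefF, and the function names, the radius `delta`, and the toy data are all assumptions made for illustration. The hierarchical partitioning and the "local leadership" criterion are not specified in the abstract and are not reproduced here.

```python
def neighborhood_dependency(X, y, features, delta=0.2):
    """Share of samples whose delta-neighborhood, measured only on
    `features`, contains a single class label (hypothetical helper)."""
    def dist(a, b):
        return sum((a[f] - b[f]) ** 2 for f in features) ** 0.5

    consistent = 0
    for i, xi in enumerate(X):
        # Labels of all samples within distance delta (includes the sample itself).
        labels = {y[j] for j, xj in enumerate(X) if dist(xi, xj) <= delta}
        if labels == {y[i]}:
            consistent += 1
    return consistent / len(X)


def relief_weights(X, y):
    """Basic Relief weights: one nearest hit and one nearest miss per sample.
    Assumes at least two classes and at least two samples per class."""
    n, d = len(X), len(X[0])
    # Per-feature value range, used to normalise differences to [0, 1].
    span = [max(r[f] for r in X) - min(r[f] for r in X) or 1.0 for f in range(d)]

    def diff(a, b):
        return [abs(a[f] - b[f]) / span[f] for f in range(d)]

    w = [0.0] * d
    for i in range(n):
        others = [j for j in range(n) if j != i]
        hit = min((j for j in others if y[j] == y[i]),
                  key=lambda j: sum(diff(X[i], X[j])))
        miss = min((j for j in others if y[j] != y[i]),
                   key=lambda j: sum(diff(X[i], X[j])))
        dh, dm = diff(X[i], X[hit]), diff(X[i], X[miss])
        for f in range(d):
            # Reward features that separate classes, penalise those that split a class.
            w[f] += (dm[f] - dh[f]) / n
    return w


# Toy data (assumed): feature 0 separates the classes, feature 1 does not.
X = [[0.0, 0.3], [0.1, 0.9], [0.9, 0.2], [1.0, 0.8]]
y = [0, 0, 1, 1]

print(relief_weights(X, y))
print(neighborhood_dependency(X, y, [0]), neighborhood_dependency(X, y, [1]))
```

On this toy data, feature 0 receives the larger Relief weight and a higher dependency than feature 1. A subspace-based method along the lines described in the abstract would compute such dependencies per feature group rather than per feature, which is what makes batch elimination possible.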

Key words

high-dimensional small sample data/feature selection/ReliefF/hierarchical subspace/neighborhood rough set

Classification

Computer science and automation

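Cite this article

Cheng Fengwei, Wang Wenjian, Zhang Zhenzhen. Hierarchical subspace ReliefF feature selection algorithm for high-dimensional small sample data [J]. Journal of Nanjing University (Natural Science), 2023, 59(6): 928-936.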

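Funding

National Natural Science Foundation of China (62076154, U1805263), Central Government Guided Local Science and Technology Development Fund (YDZX20201400001224), Natural Science Foundation of Shanxi Province (201901D111030), Shanxi Province Education Science "14th Five-Year Plan" Project (GH21395)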

Journal of Nanjing University (Natural Science)

OA / CSCD / CSTPCD

ISSN 0469-5097
