Distance Metric Feature Selection Algorithm Based on Pareto Dominance Strategy

罗雅欣 潘晓英 梁家铭 李航凯 王燕

Computer Technology and Development (计算机技术与发展), 2025, Vol. 35, Issue 5: 82-89, 8. DOI: 10.20165/j.cnki.ISSN1673-629X.2025.0007

Distance Metric Feature Selection Algorithm Based on Pareto Dominance Theory

罗雅欣 (1), 潘晓英 (2), 梁家铭 (3), 李航凯 (2), 王燕 (2)

Author information

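  • 1. School of Computer Science, Xi'an University of Posts and Telecommunications, Xi'an, Shaanxi 710121
  • 2. School of Computer Science, Xi'an University of Posts and Telecommunications, Xi'an, Shaanxi 710121; Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an University of Posts and Telecommunications, Xi'an, Shaanxi 710121
  • 3. School of Computer Science, Xi'an University of Posts and Telecommunications, Xi'an, Shaanxi 710121; School of Science, Xi'an University of Posts and Telecommunications, Xi'an, Shaanxi 710121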

Abstract

A distance metric feature selection algorithm (DMPD) based on the Pareto dominance strategy is proposed to address the imbalanced nature of high-dimensional small-sample data and the long training times and low performance of prediction models built on such data. The algorithm aims to optimize the construction of prediction models and reduce computational cost. First, it evaluates the correlation between each feature and the category using the Fisher Score algorithm and sorts the features by score; then, from the screening results at different feature dimensions, the best feature dimension is chosen according to classification performance, which completes feature pre-selection. Second, cosine similarity is used to measure the similarity between features, and Pareto dominance theory is applied to remove, one by one, the features dominated by the feature with the highest category correlation, which eliminates redundant features and yields a compact feature subset. Experimental results on six different datasets show that, at the same feature-set dimension, DMPD significantly improves classification performance and also outperforms using the Fisher Score or MIC (Mutual Information Coefficient) algorithm alone. Moreover, compared with the FCBF-MIC (Fast Correlation Feature Selection Mutual Information Coefficient) algorithm, DMPD improves computational efficiency and achieves better classification ability with fewer feature dimensions, demonstrating its effectiveness on imbalanced high-dimensional small-sample data.
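The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact Pareto dominance rule or the procedure for choosing the best feature dimension, so the sketch assumes that the pre-selection size `k` is supplied by the caller and that a feature is "dominated" when it has a lower Fisher Score than the current best feature and a cosine similarity to it of at least `sim_threshold`. The function names, `k`, and `sim_threshold` are all hypothetical.

```python
import numpy as np


def fisher_score(X, y):
    """Fisher Score per feature: between-class scatter / within-class scatter."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xc = X[y == label]
        between += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        within += len(Xc) * Xc.var(axis=0)
    return between / (within + 1e-12)  # small epsilon avoids division by zero


def cosine_similarity_matrix(X):
    """Cosine similarity between every pair of feature columns."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=0)
    norms[norms == 0] = 1.0  # leave all-zero columns as zero vectors
    unit = X / norms
    return unit.T @ unit


def select_features(X, y, k, sim_threshold=0.9):
    """Pre-select k features by Fisher Score, then drop redundant ones.

    Assumption (not from the paper): a feature counts as dominated when it is
    less relevant than the current best feature and too similar to it.
    """
    X = np.asarray(X, dtype=float)
    scores = fisher_score(X, y)
    # Stage 1: keep the k highest-scoring features, most relevant first.
    candidates = list(np.argsort(scores)[::-1][:k])
    sim = cosine_similarity_matrix(X)
    selected = []
    # Stage 2: repeatedly take the most relevant remaining feature and
    # remove every remaining feature it dominates.
    while candidates:
        best = candidates.pop(0)
        selected.append(int(best))
        candidates = [j for j in candidates if sim[best, j] < sim_threshold]
    return selected
```

Calling `select_features(X, y, k=50)` on a samples-by-features matrix returns the indices of the retained features in decreasing order of Fisher Score. In the paper the pre-selection dimension is chosen by classification performance rather than fixed in advance, so `k` would in practice come from a search over candidate dimensions.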

Key words

high-dimensional small sample/feature selection/distance metric/Pareto dominance theory/max-relevance and min-redundancy

Classification

Information Technology and Security Science

Cite this article

罗雅欣, 潘晓英, 梁家铭, 李航凯, 王燕. Distance Metric Feature Selection Algorithm Based on Pareto Dominance Strategy[J]. Computer Technology and Development, 2025, 35(5): 82-89, 8.

Funding

Key Research and Development Program of Shaanxi Province (2023-YBSF-476)

Computer Technology and Development

ISSN 1673-629X
