Feature Selection Combining Artificial Bee Colony with K-means Clustering

Sun Lin, Liu Menghan, Xue Zhan'ao

Journal of Frontiers of Computer Science and Technology, 2024, Vol. 18, Issue (1): 93-110, 18.

DOI: 10.3778/j.issn.1673-9418.2212075

Feature Selection Combining Artificial Bee Colony with K-means Clustering

Sun Lin¹, Liu Menghan², Xue Zhan'ao²

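Author information

1. College of Artificial Intelligence, Tianjin University of Science and Technology, Tianjin 300457, China
2. College of Computer and Information Engineering, Henan Normal University, Xinxiang, Henan 453007, China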

Abstract

K-means clustering is a simple and efficient statistical analysis method that converges quickly and is easy to implement. However, the traditional K-means clustering algorithm is sensitive to the choice of initial cluster centers and easily falls into a local optimum; in addition, most unsupervised feature selection algorithms tend to ignore the relationships between features. To address these issues, this paper proposes a feature selection algorithm combining artificial bee colony with K-means clustering. Firstly, so that samples in the same cluster are highly similar and samples in different clusters are dissimilar, a new fitness function is constructed from the degree of aggregation within clusters and the degree of dispersion between clusters, which better reflects the characteristics of each sample; a new expression for the probability of a honey source being selected is then constructed. Secondly, a weight that decreases gradually as the number of iterations increases is designed, and a honey source location update expression that dynamically narrows the search range of the bee colony is proposed. Thirdly, to overcome the limitation of the traditional Euclidean distance, which considers only the cumulative difference between vectors, a weighted Euclidean distance expression is constructed that accounts for both the differing degrees of influence of the samples and the similarity between samples. Finally, the standard deviation and the distance correlation coefficient are introduced to define feature discrimination and feature representativeness, and their product is used to measure feature importance. Experimental results show that the proposed algorithm accelerates the convergence of the artificial bee colony algorithm, improves the clustering performance of the K-means algorithm, and effectively improves the classification performance of feature selection.
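The abstract states only that feature importance is the product of feature discrimination (built on the standard deviation) and feature representativeness (built on the distance correlation coefficient); it does not give the exact formulas. The following is a minimal sketch of that idea under stated assumptions, not the authors' definition: discrimination is taken to be the population standard deviation of a feature, and representativeness is taken to be the mean distance correlation between that feature and every other feature. The function names `distance_correlation` and `feature_importance` are hypothetical.

```python
import math
from statistics import pstdev


def _centered_distances(values):
    """Pairwise |v_i - v_j| matrix, double-centered by row, column and grand mean."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_mean = [sum(row) / n for row in dist]
    grand_mean = sum(row_mean) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [[dist[i][j] - row_mean[i] - row_mean[j] + grand_mean
             for j in range(n)] for i in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation (between 0 and 1) of two equal-length sequences."""
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2
    dvar_x = sum(v * v for row in a for v in row) / n ** 2
    dvar_y = sum(v * v for row in b for v in row) / n ** 2
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0:
        return 0.0  # a constant feature carries no dependence information
    return math.sqrt(max(dcov2, 0.0) / denom)


def feature_importance(columns):
    """Score each feature (one list of sample values per feature).

    Assumed definitions, not taken from the paper:
      discrimination     = population standard deviation of the feature
      representativeness = mean distance correlation with the other features
      importance         = discrimination * representativeness
    """
    scores = []
    for i, col in enumerate(columns):
        discrimination = pstdev(col)
        others = [distance_correlation(col, other)
                  for j, other in enumerate(columns) if j != i]
        # With a single feature there is nothing to represent; fall back to 1.0.
        representativeness = sum(others) / len(others) if others else 1.0
        scores.append(discrimination * representativeness)
    return scores
```

Because the standard deviation depends on the scale of each feature, a sketch like this would presumably be applied to features that have been normalized to a common range first; that preprocessing step is also an assumption here.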

Key words

feature selection/artificial bee colony/K-means clustering/feature importance

Classification

Information Technology and Security Science

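Cite this article

Sun Lin, Liu Menghan, Xue Zhan'ao. Feature Selection Combining Artificial Bee Colony with K-means Clustering[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(1): 93-110, 18.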

Funding

This work was supported by the National Natural Science Foundation of China (62076089, 61976082).

Journal of Frontiers of Computer Science and Technology

OA, Peking University Core Journal, CSTPCD

ISSN: 1673-9418
