Statistics & Decision (统计与决策), 2025, Vol. 41, Issue 4: 51-56, 6. DOI: 10.13546/j.cnki.tjyjc.2025.04.009
Density Peak Clustering Optimization Algorithm for Data With Strong Density Disparity Among Clusters
Abstract
To address the poor performance of the density peak clustering (DPC) algorithm on datasets whose clusters differ strongly in density, this paper optimizes the algorithm in two respects, the selection of cluster centers and the allocation of the remaining samples, and proposes a density peak clustering optimization algorithm for data with strong density disparity among clusters. First, the K-nearest neighbor idea is introduced into the definition of the local density of sample points. This takes the spatial distribution of the samples into account and narrows the density gap between samples drawn from clusters of very different density, which improves the accuracy of cluster center selection. Second, information from the K-nearest neighbors, the natural nearest neighbors, and the shared nearest neighbors is combined by mixed weighting to construct a weighted similarity between samples. This strengthens the correlation between points that belong to the same cluster, so that the remaining samples are allocated correctly even when density varies widely among clusters. Comparison experiments against the DPC, K-means, and DBSCAN algorithms show that the proposed method achieves better clustering results on datasets with substantial disparity in density among clusters.
Keywords
density peak clustering / K-nearest neighbor / mixed nearest neighbor / sample similarity
Classification
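Information Technology and Security Science
Cite this article
HU Wenqi, LI Yongbo, WANG Xiaotong, MENG Luwen. Density Peak Clustering Optimization Algorithm for Data With Strong Density Disparity Among Clusters [J]. Statistics & Decision (统计与决策), 2025, 41(4): 51-56, 6.
Funding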
National Natural Science Foundation of China, Young Scientists Fund (51809274)
National Social Science Fund of China, Major Program (19ZDA100)
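
Illustrative sketch
The abstract states that the local density is defined with the K-nearest neighbor idea, but it does not give the formula. The sketch below therefore assumes one commonly used form, rho_i = exp(-mean of squared distances to the K nearest neighbors), together with the standard DPC quantities delta (distance to the nearest point of higher density) and gamma = rho * delta for ranking candidate centers. The function names `knn_density_peaks` and `pick_centers` and the parameter `k` are hypothetical, and the mixed-weighted similarity used for allocating the remaining samples is not implemented here.

```python
import numpy as np


def knn_density_peaks(X, k=5):
    """Return (rho, delta) for each row of X.

    rho   : local density from the k nearest neighbors. The form
            exp(-mean squared kNN distance) is an ASSUMPTION, not
            the paper's confirmed definition.
    delta : distance to the nearest point with strictly higher rho;
            for the densest point(s), the largest distance from it.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise Euclidean distance matrix, shape (n, n)
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Sort each row; column 0 is the zero self-distance, so skip it
    knn_dist = np.sort(dist, axis=1)[:, 1:k + 1]
    rho = np.exp(-(knn_dist ** 2).mean(axis=1))
    delta = np.empty(n)
    for i in range(n):
        higher = rho > rho[i]
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()
    return rho, delta


def pick_centers(rho, delta, n_clusters):
    """Indices of the n_clusters points with the largest rho * delta."""
    gamma = rho * delta
    return np.argsort(gamma)[::-1][:n_clusters]
```

Because this density depends only on distances to a point's own neighbors, the core of a sparse cluster still stands out relative to its surroundings, which is the effect the abstract attributes to the K-nearest neighbor definition. In practice `k` and `n_clusters` would need tuning per dataset.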