Adaptive density peak clustering algorithm
Open access; indexed in the Peking University Core Journals list (北大核心) and CSTPCD
Density peak clustering (DPC) is widely used because it is simple and efficient. However, it has two shortcomings: (1) for data sets whose clusters have uneven densities or imbalanced sizes, the true cluster centers are hard to identify in the decision graph that DPC produces; (2) it suffers from a "chain effect", in which misassigning the highest-density point of a region causes every point in that region to be assigned to the same wrong cluster. To address both shortcomings, a new notion of natural neighbor (NaN) is introduced and a natural-neighbor-based density peak clustering algorithm (DPC-NaN) is proposed. The algorithm uses a new natural-neighborhood density to identify noise points, selects the initial pre-cluster centers, and assigns the non-noise points by the density peak method to obtain pre-clusters. It then determines the boundary points and merging radius of the pre-clusters and adaptively merges them into the final clusters. The proposed algorithm requires no manually preset parameters and alleviates the chain effect. Experimental results show that, compared with related clustering algorithms, the proposed algorithm obtains better clustering results on typical data sets and also performs well in image segmentation.
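For readers unfamiliar with the baseline, the following is a minimal sketch of plain DPC (not DPC-NaN) as it is commonly described: each point gets a local density rho and a distance delta to its nearest higher-density point, centers are the points where both are large, and every other point inherits the label of its nearest higher-density neighbor. The cutoff distance `d_c`, the function names, and the use of `rho * delta` to rank centers are assumptions made for illustration; `d_c` is exactly the kind of hand-set parameter the abstract says DPC-NaN avoids, and the natural-neighbor density and pre-cluster merging steps are not reproduced here.

```python
import numpy as np

def dpc_quantities(X, d_c):
    """Compute rho, delta, nearest-higher-density parent, and density order.

    d_c is a hand-chosen cutoff distance (an assumption of this sketch).
    """
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # rho_i: number of other points within the cutoff distance
    rho = (dist < d_c).sum(axis=1) - 1
    # Indices sorted by decreasing density; ties are broken by position
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)
    parent = np.full(n, -1)
    for rank, i in enumerate(order):
        if rank == 0:
            # Global density peak: delta is its largest distance to any point
            delta[i] = dist[i].max()
            continue
        higher = order[:rank]
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        parent[i] = j
    return rho, delta, parent, order

def assign_labels(parent, order, centers):
    """Propagate labels from centers down the density ordering."""
    labels = np.full(len(parent), -1)
    for c, i in enumerate(centers):
        labels[i] = c
    for i in order:
        if labels[i] == -1 and parent[i] != -1:
            # One wrong parent label is inherited by everything below it,
            # which is the "chain effect" described in the abstract.
            labels[i] = labels[parent[i]]
    return labels

def dpc(X, d_c, n_clusters):
    rho, delta, parent, order = dpc_quantities(X, d_c)
    # Rank candidate centers by rho * delta (one common heuristic)
    centers = np.argsort(-(rho * delta), kind="stable")[:n_clusters]
    return assign_labels(parent, order, centers)
```

Because labels flow strictly from higher to lower density, a single misassigned high-density point drags its whole region with it, which is the weakness the pre-cluster merging step in the abstract is meant to alleviate.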
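ZHANG Qiang (张强); ZHOU Shuisheng (周水生); ZHANG Ying (张颖)
School of Mathematics and Statistics, Xidian University, Xi'an, Shaanxi 710071, China
Computer Science and Automation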
clustering; density peak clustering; natural neighbor; image segmentation
Journal of Xidian University (Natural Science Edition) [《西安电子科技大学学报(自然科学版)》], 2024, Issue 2
170-181 / 12
National Natural Science Foundation of China (61772020)