
Persistent homology based clustering algorithm for high-dimensional data

Open Access; Peking University Core Journal (北大核心); CSTPCD


To handle complex high-dimensional data, a persistent homology based clustering (PHBC) algorithm is proposed. The algorithm treats the data from a topological perspective: it uses simplicial complexes to compute the topological features of samples of different classes, records these features as persistent homology information, and converts that information into vectors that serve as the input of a clustering algorithm, so that traditional clustering algorithms can process high-dimensional data. Experimental results show that PHBC can handle a variety of complex high-dimensional data. Compared with several classical clustering methods, it achieves improvements on multiple clustering metrics, and the standard deviation of these metrics is smaller, indicating more stable clustering results.
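The abstract does not give the construction in detail, so the following is only a minimal sketch of the general pipeline it describes (topological features, then persistent homology information, then a vector, then a conventional clustering step). It rests on several assumptions that are not taken from the paper: each sample is treated as a small point cloud, only 0-dimensional persistent homology of a Vietoris-Rips filtration is computed (its finite death times are the edge lengths of a minimum spanning tree), the vectorization keeps the largest death times, and plain k-means stands in for the "traditional clustering algorithm". The names `h0_death_times`, `vectorize`, `kmeans` and `make_sample` are hypothetical.

```python
import math
import random


def h0_death_times(points):
    """Finite death times of 0-dimensional persistent homology (Vietoris-Rips).

    Every point is born at scale 0. A connected component dies when an edge
    merges it into another one, so the death times are exactly the edge
    lengths of a minimum spanning tree (Kruskal with union-find).
    """
    n = len(points)
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i in range(n) for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # this edge merges two components: one of them dies
            parent[ri] = rj
            deaths.append(length)
    return deaths  # n - 1 values in ascending order


def vectorize(deaths, size):
    """Assumed vectorization: the `size` largest death times, zero padded."""
    top = sorted(deaths, reverse=True)[:size]
    return top + [0.0] * (size - len(top))


def kmeans(vectors, k, iters=20, seed=0):
    """Plain Lloyd's k-means, a stand-in for any traditional clustering step."""
    rng = random.Random(seed)
    centers = rng.sample(vectors, k)
    labels = [0] * len(vectors)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(v, centers[c]))
                  for v in vectors]
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def make_sample(rng, two_blobs, dim=10, n=12):
    """Toy sample: n points in `dim` dimensions, in one blob or two far blobs."""
    points = []
    for i in range(n):
        shift = 10.0 if (two_blobs and i % 2) else 0.0
        points.append([shift + rng.gauss(0, 0.1) for _ in range(dim)])
    return points


# Toy data: odd-indexed samples have two components, even-indexed have one.
rng = random.Random(42)
samples = [make_sample(rng, two_blobs=(i % 2 == 1)) for i in range(10)]
vectors = [vectorize(h0_death_times(s), size=3) for s in samples]
labels = kmeans(vectors, k=2)
print(labels)
```

In this toy setting the two kinds of samples differ only in topology (one connected component versus two), and the largest death time captures that difference regardless of the ambient dimension, which is the intuition behind clustering on persistence vectors rather than on raw coordinates. A fuller treatment would also use higher-dimensional homology, for which a dedicated library would be the usual choice.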

XIONG Zhengda (熊正大); WEI Yizhuo (韦逸卓); XIONG Ziheng (熊子恒); HE Kun (何琨)

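School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China; Xiangtao College (香涛学院), Wuhan University of Science and Technology, Wuhan 430081, Hubei, China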

Subject area: Computer Science and Automation

Keywords: machine learning; clustering; high-dimensional data; topological data analysis; persistent homology

Journal of Huazhong University of Science and Technology (Natural Science Edition), 2024, No. 2

Pages 29-35 (7 pages)

Supported by the National Natural Science Foundation of China (61772219).

DOI: 10.13245/j.hust.240202
