Journal of Huazhong University of Science and Technology (Natural Science Edition), 2024, Vol. 52, Issue 2: 29-35, 7. DOI: 10.13245/j.hust.240202
Persistent homology based clustering algorithm for high-dimensional data
Abstract
A new clustering algorithm, persistent homology based clustering (PHBC), was proposed for high-dimensional data. The data were processed from a topological perspective, and the topological features of different types of samples were computed using simplicial complexes. These features were recorded as persistent homology information, which was then converted into vector form and used as the input of the clustering algorithm. In this way, the clustering algorithm can process high-dimensional data after preprocessing. Experimental results show that the PHBC algorithm can handle various complex high-dimensional data. Compared with several typical clustering algorithms, PHBC improves clustering performance considerably as evaluated by a variety of clustering metrics. In addition, the standard deviation of these metrics is smaller, indicating that the output of PHBC is more stable.
Keywords
machine learning / clustering / high-dimensional data / topological data analysis / persistent homology
Classification
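Information technology and security science
Cite this article
Xiong Zhengda (熊正大), Wei Yizhuo (韦逸卓), Xiong Ziheng (熊子恒), He Kun (何琨). Persistent homology based clustering algorithm for high-dimensional data [J]. Journal of Huazhong University of Science and Technology (Natural Science Edition), 2024, 52(2): 29-35, 7.
Funding
National Natural Science Foundation of China (61772219).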