A scalable hierarchical clustering algorithm based on improved local density

Chen Bin (陈斌), Xie Wenbo (谢文波), Fu Xun (付勋), Zhang Hengji (张恒基), Wang Xin (王欣)

Journal of Nanjing University (Natural Science), 2024, Vol. 60, Issue 3: 370-382, 13. DOI: 10.13232/j.cnki.jnju.2024.03.002

Density-based scalable hierarchical clustering

Chen Bin [1], Xie Wenbo [1], Fu Xun [1], Zhang Hengji [1], Wang Xin [1]

Author information

  • 1. School of Computer Science, Southwest Petroleum University, Chengdu 610500

Abstract

Hierarchical clustering is an important research area in unsupervised learning and, owing to its good interpretability, is widely used in data mining. Most hierarchical clustering algorithms merge clusters by computing pairwise distances. This step has high time and space complexity, which makes these algorithms impractical for large-scale datasets. This paper proposes a density-based scalable hierarchical clustering algorithm (DBSC). First, the algorithm constructs a nearest neighbor graph from the nearest neighbor relationships in the data. It then selects representative roots on each nearest neighbor component, based on the local density of the reciprocal nearest neighbor nodes. In addition, to reduce the interference of isolated nearest neighbor components with root selection, the algorithm reconnects such components to an appropriate nearest neighbor component through second nearest neighbors. By iterating these steps, the algorithm repeatedly selects representative roots and builds the clustering tree bottom-up. Experiments on large real-world datasets show that the algorithm scales to datasets with hundreds of thousands of items while maintaining clustering accuracy and fast response.
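
The bottom-up loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' DBSC implementation, and it rests on several assumptions: the abstract does not define the local density, so the sketch uses a node's in-degree in the nearest neighbor graph as a stand-in; it keeps exactly one root per component, chosen from that component's reciprocal nearest neighbor pair; it omits the second-nearest-neighbor reconnection step; and it finds neighbors by brute force, which is quadratic and would have to be replaced by a scalable neighbor search on large data. The function names `nearest_neighbor`, `one_level`, and `build_hierarchy` are hypothetical.

```python
import math


def nearest_neighbor(points, i):
    """Index of the point closest to points[i] (brute force; ties go to the lowest index)."""
    best, best_d = -1, math.inf
    for j, q in enumerate(points):
        if j == i:
            continue
        d = math.dist(points[i], q)
        if d < best_d:
            best, best_d = j, d
    return best


def one_level(points):
    """One merge round: returns (labels, roots).

    roots  - indices of the chosen root of each nearest neighbor component
    labels - labels[i] is the position in roots of the component containing i
    """
    n = len(points)
    nn = [nearest_neighbor(points, i) for i in range(n)]

    # ASSUMPTION: in-degree in the nearest neighbor graph stands in for local density.
    indeg = [0] * n
    for j in nn:
        indeg[j] += 1

    # Union-find over the edges i -> nn[i] yields the nearest neighbor components.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in enumerate(nn):
        parent[find(i)] = find(j)

    # Each component contains one reciprocal pair (i and j name each other);
    # the denser endpoint becomes the root, with ties going to the lower index.
    root_of = {}
    for i, j in enumerate(nn):
        if nn[j] == i:
            root_of[find(i)] = max((i, j), key=lambda k: (indeg[k], -k))

    roots = sorted(root_of.values())
    position = {find(r): p for p, r in enumerate(roots)}
    labels = [position[find(i)] for i in range(n)]
    return labels, roots


def build_hierarchy(points):
    """Repeat one_level on the surviving roots until a single root remains.

    Returns one label list per level, each labelling every original point.
    """
    current = list(range(len(points)))  # original indices of current representatives
    assign = list(range(len(points)))   # assign[p] = position in current for point p
    levels = []
    while len(current) > 1:
        labels, roots = one_level([points[i] for i in current])
        assign = [labels[a] for a in assign]
        current = [current[r] for r in roots]
        levels.append(list(assign))
    return levels


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    for depth, labels in enumerate(build_hierarchy(pts), start=1):
        print(depth, labels)
```

Every component of a nearest neighbor graph holds at least two points, so each round at least halves the number of representatives and the loop terminates after a logarithmic number of levels. The list of levels is a flat encoding of the clustering tree, with the finest partition first and the single root cluster last.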

Key words

hierarchical clustering / local density peak / nearest neighbor graph / reciprocal nearest neighbor

Classification

Information Technology and Security Science

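Cite this article

Chen Bin, Xie Wenbo, Fu Xun, Zhang Hengji, Wang Xin. A scalable hierarchical clustering algorithm based on improved local density [J]. Journal of Nanjing University (Natural Science), 2024, 60(3): 370-382, 13.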

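Funding

Young Scholars Development Fund of Southwest Petroleum University, Chengdu, Sichuan Province (202199010142); Sichuan Science and Technology Innovation Talent Fund (2022JDRC0009); Southwest Petroleum University Natural Science "Qihang Program" (2023QHZ010); Natural Science Foundation of Sichuan Province (2024NSFSC1464)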

Journal of Nanjing University (Natural Science)

OA / Peking University Core / CSTPCD

ISSN 0469-5097
