Journal of Guangdong University of Technology, 2017, Vol. 34, Issue (3): 49-53, 58, 6. DOI: 10.12052/gdutxb.170011
An Improved mpts-HDBSCAN Algorithm
Abstract
Cluster analysis is an important branch of unsupervised pattern classification, and DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is one of the most common density-based clustering algorithms. It is widely studied and applied in many fields because it can find clusters of arbitrary shape in the presence of noise. This paper examines shortcomings of DBSCAN and of recent improved algorithms based on it. A new data partitioning method is proposed to address the degradation in mpts-HDBSCAN clustering quality on datasets of varied density. First, the method calculates the number of groups from the histogram of the data distribution. Second, it decides whether to partition the dataset based on a threshold value. The sub-datasets produced by partitioning are each clustered with mpts-HDBSCAN, and the resulting sub-clusters are finally merged into one result. Experiments show that the combined algorithm finds clusters more effectively than mpts-HDBSCAN alone when the dataset density is uneven.
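The partition-then-cluster procedure described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which quantity is histogrammed, how the threshold is defined, or how sub-clusters are merged, so everything here is an assumption: the input is one density value per point (for example a k-th nearest neighbour distance), bins are equal-width, the threshold is a minimum run of empty bins (`gap_threshold`), `cluster_fn` is a hypothetical stand-in for mpts-HDBSCAN, and merging is reduced to relabelling so that cluster ids from different partitions do not collide. All function and parameter names are illustrative and are not taken from the paper.

```python
from typing import Callable, List, Sequence


def bin_index(value: float, lo: float, width: float, n_bins: int) -> int:
    """Equal-width bin of a value; the maximum value falls in the last bin."""
    return min(int((value - lo) / width), n_bins - 1)


def partition_by_density(values: Sequence[float], n_bins: int = 10,
                         gap_threshold: int = 2) -> List[List[int]]:
    """Group point indices by histogram modes (assumed rule, not the paper's).

    A new group starts after a run of at least `gap_threshold` empty bins.
    A single returned group means the dataset is not partitioned.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal
    counts = [0] * n_bins
    for v in values:
        counts[bin_index(v, lo, width, n_bins)] += 1

    group_of_bin, group, empty_run = [0] * n_bins, 0, 0
    for b, c in enumerate(counts):
        if c == 0:
            empty_run += 1
        else:
            if empty_run >= gap_threshold:
                group += 1  # the gap is wide enough: open a new group
            empty_run = 0
        group_of_bin[b] = group

    groups: List[List[int]] = [[] for _ in range(group + 1)]
    for i, v in enumerate(values):
        groups[group_of_bin[bin_index(v, lo, width, n_bins)]].append(i)
    return groups


def cluster_partitions(values: Sequence[float],
                       cluster_fn: Callable[[List[int]], List[int]],
                       n_bins: int = 10, gap_threshold: int = 2) -> List[int]:
    """Cluster each partition separately, then combine labels (-1 = noise)."""
    labels = [-1] * len(values)
    offset = 0
    for idx in partition_by_density(values, n_bins, gap_threshold):
        sub = cluster_fn(idx)  # hypothetical stand-in for mpts-HDBSCAN
        for i, lab in zip(idx, sub):
            if lab >= 0:
                labels[i] = lab + offset
        offset += max(sub, default=-1) + 1  # keep ids unique across partitions
    return labels
```

In this sketch a dataset whose density values form a single histogram mode is passed to the clusterer unchanged, which mirrors the abstract's statement that partitioning happens only when the threshold test calls for it.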
Key words
clustering / data partitioning / mpts-HDBSCAN / merging sub-clusters
Classification
Information Technology and Security Science
Cite this article
Wang Rongrong, Fu Xiufen. An improved mpts-HDBSCAN algorithm [J]. Journal of Guangdong University of Technology, 2017, 34(3): 49-53, 58, 6.
Funding
Guangdong Provincial Science and Technology Plan Project (2013B010401034)