| 注册
首页|期刊导航|南京大学学报(自然科学版)|基于分位数半径的动态K-means算法

基于分位数半径的动态K-means算法

程明畅 刘友波 张程嘉 马铁丰

南京大学学报(自然科学版)2018,Vol.54Issue(1):48-55,8.
南京大学学报(自然科学版)2018,Vol.54Issue(1):48-55,8.DOI:10.13232/j.cnki.jnju.2018.01.006

基于分位数半径的动态K-means算法

Dynamic K-means algorithm based on quantile radius

程明畅 1刘友波 2张程嘉 2马铁丰1

作者信息

  • 1. 西南财经大学统计学院,成都,611130
  • 2. 四川大学电气信息学院,成都,610065
  • 折叠

摘要

Abstract

There are obvious deficiencies in the well-known K-means algorithm:one is that it is sensitive to initial values,and the other is that the number of clusters needs to be given artificially.Many related works have improved the performance of K-means,but most of them only focus on single aspect rather than both on speed and identification of the number of clusters.The Hierarchical K-means algorithm(H K-means)was proposed to generate a center set C by carrying out K-means repeatedly with a constant k .After that,it generates the initial cluster centers of K-means by executing Hierarchical clustering algorithm on the center set C.This paper focuses on the geometric features of clusters,proposing a dynamic K-means algorithm based on quantile radius (QRD K-means ).This algorithm is on the basis of Hierarchical K-means and repeats K-means with varying k values.After executing K-means multiple times,a series of cluster centers are obtained to be stored in set C.Moreover,QRD K-means addi-tionally a new concept called quantile radius,which is a certain quantile of distance between the center and the sample points which belong to the corresponding cluster.Then it merges the centers in set C into a new cluster by comparing the pair-wise distance of centers with the sum of the corresponding quantile radiuses.In this way,a clustering result of centers can be obtained adaptively.At last,QRD K-means uses the centers of these new clusters as the initial centers of K-means to get the finial clustering result.The change in k value makes the center set C cover more areas than the previous Hierarchical K-means and can lead to more representative initial cluster centers.The way to get initial cluster centers by means of quantile radius is also faster.In the simulation experiments,QRD K-means was compared with Hierarchical K-means,DP-means and X-means both on the time consumption and accuracy.Results show that QRD K-means method is faster and more efficient.

关键词

K-means/类的数目/分位数半径/动态K-means

Key words

K-means/number of clusters/quantile radius/dynamic K-means

分类

信息技术与安全科学

引用本文复制引用

程明畅,刘友波,张程嘉,马铁丰..基于分位数半径的动态K-means算法[J].南京大学学报(自然科学版),2018,54(1):48-55,8.

基金项目

国家自然科学基金(11471264,11401148,11571282,51437003) (11471264,11401148,11571282,51437003)

南京大学学报(自然科学版)

OACSCDCSTPCD

0469-5097

访问量4
|
下载量0
段落导航相关论文