K-means Clustering Optimization Based on Kernel Density Estimation

Xiong Kailing, Peng Junjie, Yang Xiaofei, Huang Jun

Computer Technology and Development, 2017, Vol. 27, Issue (2): 1-5 (5 pages). DOI: 10.3969/j.issn.1673-629X.2017.02.001

K-means Clustering Optimization Based on Kernel Density Estimation

Xiong Kailing ¹, Peng Junjie ¹, Yang Xiaofei ², Huang Jun ²

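Author information

1. School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
2. Public Security Center, Shanghai Advanced Research Institute, Chinese Academy of Sciences, Shanghai 201210, China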

Abstract

The K-means clustering algorithm is a classical method used in many fields, but it performs poorly on high-dimensional and large data sets. Kernel density estimation is a nonparametric method for estimating the density function of an unknown distribution, and it can effectively capture how a data set is distributed. Sampling is a common technique for data mining on large data sets. Density biased sampling is an improvement over simple random sampling, which easily loses important information on skewed data sets. This paper proposes a method that uses the result of kernel density estimation in two ways: it chooses sample points from the neighborhoods of the peaks of the data set's density function as the initial centers for K-means, and it performs density biased sampling on the data set, then runs K-means clustering on the sample set. The experimental results show that using kernel density estimation both to select the initial parameters and to drive density biased sampling can effectively accelerate the K-means clustering process.
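
The pipeline summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state the kernel, the bandwidth, the rule for locating density peaks, or the form of the sampling bias, so the Gaussian kernel, the `bandwidth` and `min_dist` values, the greedy peak picker, and the sampling weight `density**(-e)` with `e = 0.5` are all assumptions chosen for illustration, as is the synthetic data.

```python
import numpy as np

def gaussian_kde_density(X, bandwidth):
    """Unnormalised Gaussian kernel density estimate at every row of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * bandwidth ** 2)).mean(axis=1)

def pick_peak_centers(X, density, k, min_dist):
    """Assumed peak rule: greedily take the k densest points that lie
    at least min_dist apart from each other."""
    centers = []
    for i in np.argsort(-density):
        if all(np.linalg.norm(X[i] - c) >= min_dist for c in centers):
            centers.append(X[i])
        if len(centers) == k:
            break
    return np.array(centers)

def density_biased_sample(X, density, m, e, rng):
    """Assumed bias: draw m distinct rows with probability proportional
    to density**(-e), so sparse regions are not lost (e = 0 is uniform)."""
    w = density ** (-e)
    idx = rng.choice(len(X), size=m, replace=False, p=w / w.sum())
    return X[idx]

def kmeans(X, centers, n_iter=100):
    """Plain Lloyd iterations starting from the supplied centers."""
    centers = centers.copy()
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new, centers):
            break
        centers = new
    # Final assignment against the centers that are returned.
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return centers, d.argmin(axis=1)

# Synthetic, deliberately skewed data: three blobs of very different sizes.
rng = np.random.default_rng(0)
true_centers = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])
X = np.vstack([rng.normal(c, 0.5, size=(n, 2))
               for c, n in zip(true_centers, (600, 300, 100))])

density = gaussian_kde_density(X, bandwidth=0.5)           # 1. estimate density
init = pick_peak_centers(X, density, k=3, min_dist=3.0)    # 2. initial centers
sample = density_biased_sample(X, density, m=200, e=0.5, rng=rng)  # 3. sample
centers, labels = kmeans(sample, init)                     # 4. cluster the sample
```

In this sketch K-means runs on 200 of the 1000 points and starts from centers already placed near the density peaks, which is where the reduction in work would come from. The pairwise distance matrix in `gaussian_kde_density` is quadratic in the number of points, so a truly large data set would need a cheaper density estimate.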

Key words

K-means clustering; density biased sampling; kernel density estimation; data mining

Classification

Information Technology and Security Science

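Cite this article

Xiong Kailing, Peng Junjie, Yang Xiaofei, Huang Jun. K-means Clustering Optimization Based on Kernel Density Estimation[J]. Computer Technology and Development, 2017, 27(2): 1-5.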

Funding

National Natural Science Foundation of China (61201446)

Computer Technology and Development

OA, CSTPCD

ISSN: 1673-629X
