Acta Automatica Sinica, 2011, Vol. 37, Issue 12: 1422-1434 (13 pages). DOI: 10.3724/SP.J.1004.2011.01422
Fast Kernel Density Estimate Theorem and Scaling up Graph-based Relaxed Clustering Method
Abstract
This paper first presents the fast kernel density estimate (FKDE) theorem, which states that the integrated squared error between the Gaussian-kernel KDE of the whole dataset and that of a sampled subset depends on the sample size and the kernel width, but not on the size of the whole dataset. Next, it is shown that the objective function of the Gaussian-kernel graph-based relaxed clustering (GRC) algorithm can be decomposed into two parts: a weighted sum of Parzen windows (PW) and a "quadratic entropy" term; that is, GRC can also be viewed as a KDE problem. Based on the FKDE theorem, the scaling up GRC by KDE approximation (SUGRC-KDEA) method is then proposed. Compared with previous work, the advantage of this method is that it provides an easier and more straightforward implementation of GRC on large datasets.
Keywords
Kernel density estimate (KDE)/large dataset/clustering/sampled subset
Cite this article
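Qian Pengjiang, Wang Shitong, Deng Zhaohong. Fast Kernel Density Estimate Theorem and Scaling up Graph-based Relaxed Clustering Method [J]. Acta Automatica Sinica, 2011, 37(12): 1422-1434.
Funding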
Supported by the National Natural Science Foundation of China (60903100, 60975027)