计算机技术与发展 (Computer Technology and Development), 2017, Vol. 27, Issue 9: 31-34, 4. DOI: 10.3969/j.issn.1673-629X.2017.09.007
基于泛化能力的K-均值最佳聚类数确定方法
A Method for Determination of Optimal Value in K-means Clustering with Generalization
Abstract
The K-means clustering algorithm requires the number of clusters to be fixed in advance, a choice that is made manually and is therefore subjective. To address this drawback, a method for determining the optimal number of clusters based on generalization ability is proposed. The premise is that a good clustering result should generalize well to unknown samples. A generalization index is therefore designed to evaluate how well the obtained clustering model classifies unknown samples: the larger the index, the better the clustering model. The K value of the clustering model with the best generalization is selected as the optimal number of clusters. To verify the stability and effectiveness of the method, experiments are carried out on the Iris data set and on artificial data sets. The results indicate that the method obtains the optimal number of clusters simply and efficiently and yields a good clustering effect.
Key words
K-means clustering/optimal number of clusters/generalization/unsupervised learning
Classification
Information Technology and Security Science
Cite this article
张雄, 赵礼峰. 基于泛化能力的K-均值最佳聚类数确定方法[J]. 计算机技术与发展, 2017, 27(9): 31-34, 4.
Funding
Young Scientists Fund of the National Natural Science Foundation of China (61304169)
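The selection procedure summarized in the abstract (fit a clustering model for each candidate K, score it on samples it has not seen, keep the K with the largest score) can be illustrated with a minimal sketch. The abstract does not define the paper's generalization index, so the score below is a stand-in chosen for illustration only: a centroid-based silhouette computed on held-out samples. The function names, the farthest-first initialisation, the alternating train/held-out split, and the synthetic three-blob data are all assumptions and are not taken from the paper.

```python
import math
import random


def nearest_two(point, centres):
    """Return (index of nearest centre, distance to it, distance to second nearest)."""
    dists = sorted((math.dist(point, c), i) for i, c in enumerate(centres))
    return dists[0][1], dists[0][0], dists[1][0]


def kmeans(points, k, iters=50):
    """Plain Lloyd iterations; farthest-first initialisation is an assumption
    made here so that the sketch is deterministic."""
    centres = [points[0]]
    while len(centres) < k:
        centres.append(max(points, key=lambda p: min(math.dist(p, c) for c in centres)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest_two(p, centres)[0]].append(p)
        # Recompute each centre as the mean of its group; keep the old centre if the group is empty.
        new_centres = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
            for g, c in zip(groups, centres)
        ]
        if new_centres == centres:
            break
        centres = new_centres
    return centres


def generalization_index(centres, unseen):
    """Hypothetical stand-in index (not the paper's definition): mean of (b - a) / b
    over unseen samples, where a and b are the distances to the nearest and
    second-nearest centres. Larger means unseen samples are assigned more clearly."""
    total = 0.0
    for p in unseen:
        _, a, b = nearest_two(p, centres)
        total += (b - a) / b if b > 0 else 0.0
    return total / len(unseen)


def best_k(train, unseen, k_values):
    """Fit one model per candidate K on train and return the K with the largest index."""
    scores = {k: generalization_index(kmeans(train, k), unseen) for k in k_values}
    return max(scores, key=scores.get), scores


def make_blobs(seed=0):
    """Synthetic data: three well-separated 2-D Gaussian blobs, 40 points each."""
    rng = random.Random(seed)
    blob_centres = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    return [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
            for cx, cy in blob_centres for _ in range(40)]


data = make_blobs()
train, unseen = data[0::2], data[1::2]  # alternate samples: fit on one half, score on the other
k_star, scores = best_k(train, unseen, range(2, 7))
print(k_star, scores)
```

Scoring on held-out samples rather than on the training set is what ties the sketch to the idea of generalization: a K that splits a natural group in two tends to leave unseen samples from that group nearly equidistant from two centres, which lowers the index.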