Journal of Frontiers of Computer Science and Technology (计算机科学与探索), 2016, Vol. 10, Issue 6: 875-883 (9 pages). DOI: 10.3778/j.issn.1673-9418.1506050
Three-Way Decision Clustering Algorithm for Incomplete Data Based on q-Nearest Neighbors (基于q近邻的不完备数据三支决策聚类方法)
Abstract
Clustering is a common data analysis technique that is widely used in practice. In many real applications, however, data sets contain missing values, for reasons such as the difficulties and limitations of data acquisition and random noise. Most clustering methods cannot be applied directly to incomplete data sets. This paper therefore proposes a three-way decision clustering algorithm for incomplete data based on q-nearest neighbors. First, the algorithm finds the q nearest neighbors of each object with missing values and fills each missing value with the average value of those neighbors. Second, it applies a clustering method based on density peaks to the completed data set to obtain the clustering result. Data objects whose membership in a cluster is uncertain are assigned to the boundary region of that cluster using three-way decision theory. The three-way decision with interval sets naturally partitions a cluster into three regions: the positive region, the boundary region and the negative region, which is an advantage when dealing with soft clustering. Experimental results on several UCI data sets and synthetic data sets preliminarily show the effectiveness of the proposed algorithm.
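The filling step described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how distance is measured between objects with missing values, so the sketch assumes a root-mean-square distance over the attributes observed in both objects. The names `partial_distance` and `impute_q_nearest`, and the use of `None` to mark a missing value, are hypothetical.

```python
import math
from typing import List, Optional

Row = List[Optional[float]]  # None marks a missing value (assumed encoding)


def partial_distance(a: Row, b: Row) -> float:
    """Root-mean-square difference over attributes observed in both rows.

    Assumed distance; returns infinity when no attribute is shared.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def impute_q_nearest(data: List[Row], q: int) -> List[Row]:
    """Fill each missing value with the mean of that attribute over the
    q nearest rows in which the attribute is observed."""
    filled = [list(row) for row in data]  # leave the input untouched
    for i, row in enumerate(data):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidates: other rows with attribute j observed and at
            # least one observed attribute in common with the target row.
            candidates = []
            for k, other in enumerate(data):
                if k == i or other[j] is None:
                    continue
                d = partial_distance(row, other)
                if d < math.inf:
                    candidates.append((d, other[j]))
            if not candidates:
                raise ValueError(f"no usable neighbor for row {i}, attribute {j}")
            candidates.sort(key=lambda pair: pair[0])
            nearest = candidates[:q]
            filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled


# Toy data: the third object is missing its second attribute.
data = [
    [1.0, 2.0],
    [1.1, 2.2],
    [0.9, None],
    [8.0, 9.0],
]
completed = impute_q_nearest(data, q=2)
# The two nearest rows are the first two, so the gap is filled with
# the mean of 2.0 and 2.2.
```

The completed data set would then be passed to the density-peaks clustering step. Distances are computed on the original rows rather than on partly filled ones, so the result does not depend on the order in which gaps are filled; that is a design choice of this sketch, not something the abstract specifies.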
Keywords
incomplete data (不完备数据) / three-way decision clustering (三支决策聚类) / q-nearest neighbors (q近邻)
Classification
Information Technology and Security Science
Cite this article
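SU Ting (苏婷), YU Hong (于洪). Three-way decision clustering algorithm for incomplete data based on q-nearest neighbors [J]. Journal of Frontiers of Computer Science and Technology, 2016, 10(6): 875-883.
Funding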
The National Natural Science Foundation of China (国家自然科学基金) under Grant Nos. 61379114 and 61272060.