Abstract
First, we point out the need to integrate multiple clustering results and analyse how such results differ. We briefly introduce the similarity between categories of different clustering results as defined in the SAMARAH model and identify the shortcomings of existing research. We then define the concepts used by the algorithm (similarity, similarity matrix, transfer cost and closest neighbourhood), describe each of them, and state and prove the corresponding theorems. On this basis, we detail the steps of the algorithm for integrating multiple clustering results, drawing on these definitions and theorems, with emphasis on the adjustment strategy for the four different situations that arise for the two categories with the greatest similarity between two different clustering results. Taking a clustering result of integrated fuzzy C-means and a clustering result of K-means as an example, we illustrate that the proposed algorithm for integrating multiple clustering results is practical. Finally, we draw some conclusions about the algorithm.
Key words
Clustering/Algorithm/Similarity/Similarity matrix/Transfer cost/Closest neighbourhood