Weighted Subspace Clustering Algorithm for High-Dimensional Categorical Data

孙浩军 闪光辉 高玉龙 袁婷 吴云霞

Computer Engineering and Applications (计算机工程与应用), Issue (23): 131-135, 202, 6. DOI: 10.3778/j.issn.1002-8331.1301-0121

Algorithm for high-dimensional categorical data weighted sub-space clustering

孙浩军¹, 闪光辉¹, 高玉龙¹, 袁婷¹, 吴云霞¹

Author information

  • 1. College of Engineering, Shantou University, Shantou, Guangdong 515063, China

Abstract

Subspace clustering is an effective strategy for clustering high-dimensional data. Its guiding principle is to preserve as much of the original data's information as possible while clustering in a subspace that is as small as possible. Building on a study of existing soft subspace clustering methods, this paper proposes a new subspace search algorithm. The algorithm combines cluster size with information entropy to define a new scheme for assigning weights to subspace dimensions, and then uses the feature vectors of the cluster subspaces to measure the similarity between two clusters. Clustering follows the agglomerative hierarchical approach, which overcomes the shortcomings of using information entropy or a traditional similarity measure alone. Experiments on three typical categorical data sets (Zoo, Votes, and Soybean) show that, compared with other algorithms, the proposed algorithm improves clustering accuracy and is also highly stable.
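The abstract does not state the paper's formulas, so the following Python sketch only illustrates the general idea of entropy-weighted agglomerative clustering of categorical data. Everything in it is an assumption made for illustration: the weighting rule `1 / (1 + entropy)`, the mode-matching similarity, and the function names `attribute_entropy`, `dimension_weights`, `similarity`, and `agglomerate` are hypothetical and are not taken from the paper. The cluster-size term mentioned in the abstract is not modelled.

```python
from collections import Counter
from math import log2


def attribute_entropy(rows, j):
    """Shannon entropy (in bits) of the values in column j of one cluster."""
    counts = Counter(row[j] for row in rows)
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def dimension_weights(rows):
    """Assumed rule: low-entropy attributes get high weight; weights sum to 1."""
    raw = [1.0 / (1.0 + attribute_entropy(rows, j)) for j in range(len(rows[0]))]
    total = sum(raw)
    return [r / total for r in raw]


def mode(rows, j):
    """Most frequent value in column j (ties broken by first occurrence)."""
    return Counter(row[j] for row in rows).most_common(1)[0][0]


def similarity(a, b):
    """Assumed measure: shared weight on attributes whose modes agree."""
    wa, wb = dimension_weights(a), dimension_weights(b)
    return sum(
        min(wa[j], wb[j]) for j in range(len(wa)) if mode(a, j) == mode(b, j)
    )


def agglomerate(rows, k):
    """Greedy bottom-up merging: start from singletons, stop at k clusters."""
    clusters = [[r] for r in rows]
    while len(clusters) > k:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        i, j = max(pairs, key=lambda p: similarity(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy categorical data set (made up for this example, not from the paper).
data = [("a", "x", "p"), ("a", "x", "q"), ("b", "y", "p"), ("b", "y", "q")]
clusters = agglomerate(data, 2)
print(clusters)
```

On this toy data the first two attributes are constant within each resulting cluster, so they receive more weight than the third, which is the kind of per-cluster subspace a soft subspace method is meant to expose.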

Keywords

high-dimensional data / clustering / subspace / information entropy / hierarchical clustering

Classification

Information technology and security science

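Cite this article

孙浩军, 闪光辉, 高玉龙, 袁婷, 吴云霞. Algorithm for high-dimensional categorical data weighted sub-space clustering[J]. Computer Engineering and Applications, 2014, (23): 131-135, 202, 6.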

Funding

National Natural Science Foundation of China (No. 61170130).

Journal: Computer Engineering and Applications

Indexing: OA / Peking University Core Journals / CSCD / CSTPCD

ISSN 1002-8331
