High Technology Letters (English Edition), 2007, Vol. 13, Issue 2: 131-135, 5.
Scalable classification by clustering: Hybrid can be better than Pure
Abstract
The problem of scalable classification by clustering in large databases is discussed. A clustering-based classification method first generates clusters using a clustering algorithm. To classify a new data point, it finds the k nearest clusters as neighbors and assigns the point to the dominant class of these neighbors. Existing algorithms incorporate class information when making clustering decisions and produce pure clusters (each cluster is associated with only one class). We present hybrid cluster-based algorithms, which produce clusters by unsupervised clustering and allow each cluster to be associated with multiple classes. Experimental results show that the hybrid cluster-based algorithms outperform the pure ones in both classification accuracy and training speed.
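The scheme summarized above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the one-pass leader clustering, the `radius` threshold, the pooling of class counts across the k nearest clusters, and the names `build_hybrid_clusters` and `classify` are all hypothetical choices made for this example.

```python
from collections import Counter
from math import dist


def build_hybrid_clusters(points, labels, radius):
    """Group points without looking at labels (assumed leader clustering).

    Each cluster keeps a running centroid and a Counter of member classes,
    so one cluster may be associated with several classes (a hybrid cluster).
    """
    clusters = []
    for point, label in zip(points, labels):
        # Labels play no part in choosing the cluster, only distance does.
        nearest = min(clusters, key=lambda c: dist(c["centroid"], point), default=None)
        if nearest is not None and dist(nearest["centroid"], point) <= radius:
            n = nearest["size"]
            # Incremental mean update of the centroid.
            nearest["centroid"] = [
                (m * n + x) / (n + 1) for m, x in zip(nearest["centroid"], point)
            ]
            nearest["size"] = n + 1
            nearest["classes"][label] += 1
        else:
            clusters.append(
                {"centroid": list(point), "size": 1, "classes": Counter([label])}
            )
    return clusters


def classify(clusters, point, k=1):
    """Pool class counts of the k nearest clusters and return the dominant class."""
    neighbors = sorted(clusters, key=lambda c: dist(c["centroid"], point))[:k]
    votes = Counter()
    for cluster in neighbors:
        votes.update(cluster["classes"])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy data: two spatial groups, each containing a minority class.
    points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.1, 4.9), (4.8, 5.2)]
    labels = ["a", "a", "b", "b", "b", "a"]
    clusters = build_hybrid_clusters(points, labels, radius=1.0)
    print(len(clusters), classify(clusters, (0.3, 0.2)), classify(clusters, (4.9, 5.0)))
```

Because clustering ignores the labels, training is a single pass over the data, and the class information is used only at prediction time through the per-cluster counts. A pure-cluster variant would instead have to split any cluster whose members disagree on the class.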
Key words
classification/clustering/data mining
Category
Chemistry and chemical engineering
Cite this article
Deng Shengchun, He Zengyou, Xu Xiaofei. Scalable classification by clustering: Hybrid can be better than Pure[J]. High Technology Letters (English Edition), 2007, 13(2): 131-135, 5.
Funding
Supported by the High Technology Research and Development Programme of China (No. 2Q02AA413310) and the IBM SUR Research Fund.