Computer Applications and Software (计算机应用与软件), 2012, Vol. 29, Issue (1): 171-174, 229, 5.
DOCUMENT CATEGORISATION USING GRAPH MODEL BASED ON CLUSTERING ANALYSIS (基于聚类分析的图模型文档分类)
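Meng Haidong (孟海东) 1, Liu Xiaorong (刘小荣) 1
Author information
- 1. School of Information Engineering, Inner Mongolia University of Science and Technology, Baotou, Inner Mongolia 014010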
Abstract
To address the problem that the traditional vector space model treats feature terms in isolation, this paper first performs feature reduction by combining the χ² statistic with feature clustering, and then uses a graph model to capture the correlations between words. Finally, the KNN method is used to test document classification. The algorithm increases the contribution of rare words to classification, enhances the classification performance of conjunctive words, and reduces the dimensionality of the document vectors. Experiments indicate that the algorithm improves the accuracy and recall of classification.
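As a rough illustration of the χ² step mentioned in the abstract, the sketch below computes the standard χ² term-category statistic from a 2×2 contingency table of document counts. The function name `chi_square` and the counts used are hypothetical; the abstract does not give the paper's exact formulation or say how the statistic is combined with feature clustering, so this shows only the textbook form.

```python
def chi_square(a: int, b: int, c: int, d: int) -> float:
    """Chi-square statistic of a term for a category (2x2 contingency table).

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    if denominator == 0:
        # A marginal total is zero, so the table carries no usable signal.
        return 0.0
    return n * (a * d - c * b) ** 2 / denominator


# Hypothetical counts (not from the paper): the term occurs in 40 of 50
# in-category documents and in 10 of 50 out-of-category documents.
score = chi_square(40, 10, 10, 40)
```

Under this formulation, a term that is distributed independently of the category scores 0, and terms with higher scores would be kept during feature reduction.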
Key words
Clustering analysis/Graph model/Document categorisation
Category
Information technology and security science
Cite this article
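Meng Haidong, Liu Xiaorong. Document categorisation using graph model based on clustering analysis [J]. Computer Applications and Software, 2012, 29(1): 171-174, 229, 5.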