A KNN Chinese Text Classification Algorithm Based on Center Documents

Lu Ting, Wang Hao, Yao Hongliang

Computer Engineering and Applications, 2011, Vol. 47, Issue 2: 127-130, 4. DOI: 10.3778/j.issn.1002-8331.2011.02.040

K-nearest neighbor Chinese text categorization algorithm based on center documents

Author information

1. School of Computer and Information, Hefei University of Technology, Hefei 230009, China

Abstract

Automatic text categorization has become an active research topic because it supports searching for and extracting information of a specific category from large data sources. KNN is an important method for automatic text categorization: it handles large data sets stably, but it is slow. Building on KNN classification, this paper introduces the semantic relations between feature terms and, on that basis, clusters the documents to build center documents. This reduces the number of documents that KNN has to search and so increases classification speed. Simulation results show that the proposed algorithm improves speed while keeping the classification precision of traditional KNN.
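The idea summarized in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the semantic similarity measure or the clustering procedure, so the sketch substitutes plain cosine similarity on term-frequency vectors and a greedy single-pass clustering within each category. The function names, the `threshold` parameter, and the toy data are hypothetical, and the documents are assumed to be already segmented into tokens.

```python
import math
from collections import Counter, defaultdict

def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_center_documents(docs, threshold=0.5):
    """Cluster each category's documents and return one center per cluster.

    Assumption: greedy single-pass clustering with a cosine threshold stands
    in for the paper's semantic-similarity-based clustering.
    docs: list of (label, tokens). Returns a list of (label, center_vector).
    """
    clusters = defaultdict(list)  # label -> list of [summed_vector, doc_count]
    for label, tokens in docs:
        vec = Counter(tokens)
        best, best_sim = None, threshold
        for cluster in clusters[label]:
            # Cosine is scale-invariant, so the summed vector acts as the centroid.
            sim = cosine(vec, cluster[0])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters[label].append([Counter(vec), 1])  # start a new cluster
        else:
            best[0].update(vec)  # add term counts to the cluster sum
            best[1] += 1
    return [(label, {t: w / n for t, w in total.items()})
            for label, cl in clusters.items() for total, n in cl]

def knn_classify(tokens, centers, k=3):
    """Similarity-weighted KNN vote over center documents only."""
    vec = Counter(tokens)
    scored = [(cosine(vec, center), label) for label, center in centers]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    votes = defaultdict(float)
    for sim, label in scored[:k]:
        votes[label] += sim
    return max(votes, key=votes.get)

# Hypothetical toy corpus (already tokenized).
train = [
    ("sports", ["football", "match", "goal"]),
    ("sports", ["football", "goal", "team"]),
    ("sports", ["basketball", "team", "score"]),
    ("finance", ["stock", "market", "price"]),
    ("finance", ["stock", "price", "fund"]),
]
centers = build_center_documents(train, threshold=0.5)
prediction = knn_classify(["football", "team", "goal"], centers, k=3)
```

In this toy run the five training documents collapse into three center documents, so the KNN step compares a query against three vectors instead of five. That reduction in the search set is the source of the speed-up the abstract describes; how much precision is retained depends on the similarity measure and clustering actually used.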

Key words

Chinese text classification / k-Nearest Neighbor (KNN) / center documents / semantic similarity / clustering

Classification

Information Technology and Security Science

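Cite this article

Lu Ting, Wang Hao, Yao Hongliang. K-nearest neighbor Chinese text categorization algorithm based on center documents [J]. Computer Engineering and Applications, 2011, 47(2): 127-130, 4.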

Funding

National Natural Science Foundation of China (Grant No. 60705015)

Natural Science Foundation of Anhui Province of China (Grant No. 070412064)

Scientific Research and Development Fund of Hefei University of Technology (No. 070504F)

Computer Engineering and Applications

OA / CSCD / CSTPCD

ISSN: 1002-8331
