Research on a Novel Web Chinese Text Clustering Method (一种新的Web中文文本聚类方法研究)

Ye Yufei, An Shiquan, Dai Jin

Computer Applications and Software (计算机应用与软件), Issue (12): 222-225, 287, 5. DOI: 10.3969/j.issn.1000-386x.2013.12.058

Ye Yufei (1), An Shiquan (2), Dai Jin (3)

Author information

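  • 1. College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065
  • 2. College of Mobile Telecommunications, Chongqing University of Posts and Telecommunications, Chongqing 400065
  • 3. College of Computer Science and Technology, Chongqing University, Chongqing 400065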

Abstract

Traditional text clustering lacks semantic information, produces high-dimensional, sparse text feature vectors, and ignores the particular characteristics of Web text. To address these problems, we propose a Web Chinese text clustering method. Based on a HowNet-based concept space, the method filters out all terms except nouns, analyses the semantics of the important words in the text, and performs feature clustering on the label feature set and the text feature set. It then uses an improved TF-IDF algorithm to select features from these two sets, and finally represents the text as the union of the selected label features and text features. This reduces the feature dimensionality and represents the text efficiently. Experimental results demonstrate its effectiveness.
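
The selection-and-union step described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions: it uses plain TF-IDF rather than the paper's improved variant (which the abstract does not specify), it skips the HowNet-based semantic analysis and feature clustering entirely, and it assumes the input tokens have already been reduced to nouns. The function names `tfidf_scores`, `top_k`, and `represent`, and the cut-offs `k_label` and `k_text`, are hypothetical.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Plain TF-IDF per document; docs is a list of token lists.

    Assumption: this is the textbook formula, not the paper's improved one.
    """
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # document frequency counts each term once per doc
    scores = []
    for tokens in docs:
        tf = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        scores.append(
            {t: (c / total) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return scores


def top_k(score_map, k):
    """Return the k highest-scoring terms (ties broken alphabetically)."""
    ranked = sorted(score_map.items(), key=lambda kv: (-kv[1], kv[0]))
    return {t for t, _ in ranked[:k]}


def represent(label_docs, text_docs, k_label=2, k_text=3):
    """Represent each page as the union of selected label and text features."""
    label_scores = tfidf_scores(label_docs)
    text_scores = tfidf_scores(text_docs)
    return [
        top_k(ls, k_label) | top_k(ts, k_text)
        for ls, ts in zip(label_scores, text_scores)
    ]


# Hypothetical, already noun-filtered tokens for three Web pages.
label_docs = [["football", "league"], ["stock", "market"], ["football", "stock"]]
text_docs = [
    ["football", "match", "goal", "team", "team"],
    ["stock", "price", "market", "market", "bank"],
    ["football", "club", "stock", "share", "share"],
]
features = represent(label_docs, text_docs)
```

Each page ends up with at most `k_label + k_text` features, which is where the dimensionality reduction comes from in this simplified setting; the resulting feature sets could then be passed to any clustering routine that works on a text similarity measure.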

Key words

Web text clustering / Feature dimension reduction / HowNet / Text similarity

Classification

Information Technology and Security Science

Cite this article

Ye Yufei, An Shiquan, Dai Jin. Research on a novel Web Chinese text clustering method [J]. Computer Applications and Software, 2013, (12): 222-225, 287, 5.

Computer Applications and Software (计算机应用与软件)

Indexed in: OA, Peking University Core Journals (北大核心), CSCD, CSTPCD

ISSN: 1000-386X
