A Text Feature Selection Algorithm Based on Analysing the Relationship Between Words

Wu Shuang (吴双) 1, Zhang Wensheng (张文生) 1, Xu Hairui (徐海瑞) 1

Computer Engineering & Science (计算机工程与科学), 2012, Vol. 34, Issue 6: 140-145 (6 pages). DOI: 10.3969/j.issn.1007-130X.2012.06.028

Author information

  • 1. Institute of Automation, Chinese Academy of Sciences, Beijing 100190

Abstract

Traditional feature selection algorithms typically use evaluation functions to select the features that distinguish different categories of documents. However, these methods build the vector space model from isolated words, so they capture neither the important words in a document nor the relationships between words. To address these shortcomings, a new feature selection algorithm based on the relationships between words is presented. The algorithm takes key words into account, mines associations between words, and verifies the resulting association rules through correlation analysis, producing a feature space that is closely related to the category attributes. Experiments indicate that the method expresses the semantic content of documents better and achieves good categorization results.
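The mining and checking steps named in the abstract can be illustrated with a minimal sketch. The abstract does not state which rule-mining procedure or correlation measure the authors use, so everything below is an assumption: the function name `mine_word_rules`, the thresholds, the restriction to word pairs, and the use of lift (values above 1 indicate positive correlation) as a stand-in for the paper's correlation analysis.

```python
from collections import Counter
from itertools import combinations


def mine_word_rules(docs, min_support=0.4, min_confidence=0.6, min_lift=1.0):
    """Mine pairwise word rules x -> y and keep those that pass a lift check.

    docs: list of tokenized documents (lists of words).
    Returns a list of (x, y, support, confidence, lift) tuples.
    Hypothetical helper; thresholds and the lift check are illustrative only.
    """
    n = len(docs)
    doc_sets = [set(d) for d in docs]
    # Document frequency of each word and of each unordered word pair.
    word_count = Counter(w for s in doc_sets for w in s)
    pair_count = Counter(
        pair for s in doc_sets for pair in combinations(sorted(s), 2)
    )
    rules = []
    for (a, b), c in pair_count.items():
        support = c / n
        if support < min_support:
            continue
        # Evaluate the rule in both directions.
        for x, y in ((a, b), (b, a)):
            confidence = c / word_count[x]
            lift = confidence / (word_count[y] / n)
            # Correlation check: discard rules whose words are not
            # positively correlated (lift <= min_lift).
            if confidence >= min_confidence and lift > min_lift:
                rules.append((x, y, support, confidence, lift))
    return rules


if __name__ == "__main__":
    docs = [
        ["stock", "market", "price"],
        ["stock", "market", "bank"],
        ["stock", "market"],
        ["football", "match", "goal"],
        ["football", "match", "price"],
    ]
    for rule in mine_word_rules(docs):
        print(rule)
```

In a sketch like this, the surviving word pairs would serve as candidate features alongside single words; how the paper weights key words or ties rules to category labels is not described in the abstract and is left out here.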

Key words

relationship between words / feature selection / association rules / text categorization

Classification

Information Technology and Security Science

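Cite this article

Wu Shuang, Zhang Wensheng, Xu Hairui. A Text Feature Selection Algorithm Based on Analysing the Relationship Between Words [J]. Computer Engineering & Science, 2012, 34(6): 140-145.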

Funding

Supported by the National Natural Science Foundation of China (90924026)

Computer Engineering & Science (计算机工程与科学)

OA / PKU Core (北大核心) / CSCD / CSTPCD

ISSN 1007-130X
