Text Feature Selection Based on Inter-Class Dispersion and Intra-Class Concentration

Chen Yanlong, Duan Hongyu

Computer and Digital Engineering (计算机与数字工程), 2012, Vol. 40, Issue 7: 6-8, 3.

Feature Selection Based on Dispersion Degree and Concentration Degree

Chen Yanlong¹, Duan Hongyu¹

Author information

  • 1. Department of Information Engineering, Zhengzhou College of Animal Husbandry Engineering, Zhengzhou 450011

Abstract

Feature selection is one of the key steps in text categorization, and the selected feature subset directly influences the categorization results. First, two kinds of feature influence degree are defined: the dispersion degree among categories and the concentration degree within a category. For both, a larger value is better. The two influence degrees are then combined into a new feature selection method. The method evaluates each candidate feature on both criteria together, so that a more representative feature set is obtained. Simulation experiments show that, to a certain extent, the method improves text categorization performance.
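
The abstract does not give the formulas for the two influence degrees or for how they are combined, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes (hypothetically) that the dispersion degree among categories is the population standard deviation of a term's per-category document rates, that the concentration degree within a category is the term's highest per-category document rate, and that the two are combined by multiplication. The names `score_features` and `select_top_k` are made up for this example.

```python
from statistics import pstdev


def score_features(docs, labels):
    """Score each term; docs is a list of term sets, labels the parallel class labels.

    All three formulas below are assumptions made for illustration only.
    """
    classes = sorted(set(labels))
    docs_per_class = {c: sum(1 for y in labels if y == c) for c in classes}
    vocabulary = set().union(*docs)
    scores = {}
    for term in vocabulary:
        # Fraction of documents in each class that contain the term.
        rates = [
            sum(1 for d, y in zip(docs, labels) if y == c and term in d)
            / docs_per_class[c]
            for c in classes
        ]
        dispersion = pstdev(rates)   # assumed: spread of the rates across classes
        concentration = max(rates)   # assumed: coverage inside the best class
        scores[term] = dispersion * concentration  # assumed: product as the combination
    return scores


def select_top_k(scores, k):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


docs = [
    {"ball", "game", "the"},
    {"ball", "team", "the"},
    {"vote", "law", "the"},
    {"vote", "party", "the"},
]
labels = ["sport", "sport", "politics", "politics"]
scores = score_features(docs, labels)
top_terms = select_top_k(scores, 2)
```

Under these assumptions, a term such as "the" that appears equally often in every category scores zero on dispersion and is never selected, while a term that covers one category and is absent from the others scores highest on both criteria.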

Key words

feature selection, text categorization, dispersion degree, concentration degree

Classification

Information technology and security science

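Cite this article

Chen Yanlong, Duan Hongyu. Text feature selection based on inter-class dispersion and intra-class concentration [J]. Computer and Digital Engineering, 2012, 40(7): 6-8, 3.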

Funding

Supported by the Henan Province Basic and Frontier Technology Research Program (No. 102300410266).

Computer and Digital Engineering

ISSN 1672-9722
