An Improved Information Gain Method for Uyghur Feature Selection

韩军兵 哈力旦·阿布都热依木 古力努尔·艾尔肯 何燕

Computer Engineering and Applications (计算机工程与应用), 2017, Vol. 53, Issue 23: 34-38, 5. DOI: 10.3778/j.issn.1002-8331.1607-0312

Improved information gain algorithm based on Uyghur feature selection

韩军兵¹, 哈力旦·阿布都热依木¹, 古力努尔·艾尔肯¹, 何燕¹

Author information

  • 1. School of Electrical Engineering, Xinjiang University, Urumqi 830047

Abstract

Feature selection is a key step in Uyghur text classification and directly affects the classification results. To improve the performance of the traditional information gain algorithm for Uyghur feature selection, a new information gain feature selection method is proposed, based on an in-depth analysis of the characteristics of Uyghur text. The method modifies traditional information gain by combining it with in-class word frequency, a feature distribution coefficient, and inverse document frequency. It further introduces a distribution coefficient for candidate features to balance the number of features selected across classes. Finally, the method is evaluated experimentally on a Uyghur text dataset. The results show that the modified information gain algorithm substantially improves Uyghur text classification performance.
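
The abstract does not give the modified formula, so the following is only a minimal sketch of the traditional baseline it starts from: information gain computed from term presence, IG(t) = H(C) - P(t)H(C|t) - P(¬t)H(C|¬t), together with a standard inverse document frequency. The function names, the set-of-tokens document representation, and the toy corpus are all assumptions made for illustration; the paper's in-class word frequency and distribution coefficient terms are not reproduced here.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, labels, term):
    """Traditional IG of `term`, using only presence/absence in each document.

    docs   : list of sets of tokens (assumed representation)
    labels : list of class labels, one per document
    """
    classes = sorted(set(labels))
    with_term = Counter(lab for doc, lab in zip(docs, labels) if term in doc)
    without_term = Counter(lab for doc, lab in zip(docs, labels) if term not in doc)
    n = len(docs)
    n_t = sum(with_term.values())
    h_c = entropy([labels.count(c) for c in classes])
    h_given_t = entropy([with_term[c] for c in classes])
    h_given_not_t = entropy([without_term[c] for c in classes])
    return h_c - (n_t / n) * h_given_t - ((n - n_t) / n) * h_given_not_t


def inverse_document_frequency(docs, term):
    """Standard IDF, log(N / df); returns 0.0 for a term that never occurs."""
    df = sum(1 for doc in docs if term in doc)
    return math.log(len(docs) / df) if df else 0.0


# Hypothetical toy corpus: placeholder tokens, two classes.
docs = [{"a", "x"}, {"a", "y"}, {"b", "x"}, {"b", "y"}]
labels = ["sports", "sports", "health", "health"]

for term in ("a", "x"):
    print(term, information_gain(docs, labels, term),
          inverse_document_frequency(docs, term))
```

In this toy corpus the token "a" appears only in one class, so it receives the full 1 bit of class entropy, while "x" is spread evenly over both classes and receives 0. Both tokens have the same document frequency and therefore the same IDF, which illustrates why a presence-based score alone ignores how often and how evenly a term occurs within a class.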

Key words

text classification/information gain/word frequency in class/inverse document frequency/feature selection

Classification

Information Technology and Security Science

Cite this article

韩军兵, 哈力旦·阿布都热依木, 古力努尔·艾尔肯, 何燕. 改进信息增益的维吾尔文特征选择方法[J]. 计算机工程与应用, 2017, 53(23): 34-38, 5.

Funding

National Natural Science Foundation of China (No. 61163026, No. 60865001)

Natural Science Foundation of Xinjiang Uygur Autonomous Region (No. 2016D01C048)

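Journal: Computer Engineering and Applications (计算机工程与应用); ISSN 1002-8331; open access; indexed in Peking University Core Journals (北大核心), CSCD, CSTPCD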
