基于遗传算法优化综合启发式的中文网页特征提取

沈高峰 谷淑敏

智能系统学报 (CAAI Transactions on Intelligent Systems), Issue 4: 474-479, 6. DOI: 10.3969/j.issn.1673-4785.201305044

Chinese Web page feature extraction by optimizing comprehensive heuristics based on GA

沈高峰 1, 谷淑敏 2

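Author information

  • 1. School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou, Henan 450002
  • 2. Department of Basic Sciences, College of Information and Business, Zhongyuan University of Technology, Zhengzhou, Henan 450007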

Abstract

Feature extraction underlies technologies such as information retrieval, text classification, text clustering, and automatic summarization. Traditional feature extraction methods have difficulty evaluating candidate feature words comprehensively and effectively. To address this shortcoming, this paper proposes a method for extracting Chinese web page features that optimizes comprehensive heuristics with a genetic algorithm (GA). The method combines four heuristics (word frequency, word correlation, part of speech (POS), and position) to evaluate candidate features, and uses the GA to optimize the weight of each heuristic parameter. Experimental results on different test sets show that the proposed method effectively avoids the deviations of traditional extraction methods and obtains more representative features, and is therefore of practical value.
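The abstract does not give the scoring formula or the GA settings, so the following is only a minimal sketch of the general idea under stated assumptions: each candidate word carries four heuristic scores in [0, 1], the combined score is assumed to be a weighted sum, and a simple GA (truncation selection, one-point crossover, Gaussian mutation) searches for the weights. The candidate words, their scores, the `GOLD` label set, and the fitness function are all hypothetical and are not taken from the paper.

```python
import random

# Hypothetical heuristic scores per candidate word, each in [0, 1]:
# (word frequency, word correlation, part of speech, position).
CANDIDATES = {
    "algorithm": (0.9, 0.8, 1.0, 0.7),
    "feature":   (0.8, 0.9, 1.0, 0.9),
    "page":      (0.7, 0.3, 1.0, 0.2),
    "very":      (0.9, 0.1, 0.0, 0.3),
    "click":     (0.6, 0.2, 0.5, 0.1),
}
# Hypothetical hand-labelled feature words used to judge a weight vector.
GOLD = {"algorithm", "feature"}


def score(weights, heuristics):
    """Assumed combination rule: weighted sum of the four heuristic scores."""
    return sum(w * h for w, h in zip(weights, heuristics))


def top_k(weights, k):
    """Return the k candidate words with the highest combined score."""
    ranked = sorted(CANDIDATES, key=lambda w: score(weights, CANDIDATES[w]), reverse=True)
    return set(ranked[:k])


def fitness(weights):
    """Assumed fitness: fraction of gold feature words found in the top k."""
    return len(top_k(weights, len(GOLD)) & GOLD) / len(GOLD)


def normalize(weights):
    """Rescale the weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def optimize_weights(generations=30, pop_size=20, mutation_rate=0.2, seed=0):
    """Search for heuristic weights with a small generational GA."""
    rng = random.Random(seed)
    population = [normalize([rng.random() + 1e-9 for _ in range(4)])
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]        # truncation selection (elitist)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, 3)                   # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < mutation_rate:          # Gaussian mutation of one weight
                i = rng.randrange(4)
                child[i] = max(1e-9, child[i] + rng.gauss(0, 0.1))
            children.append(normalize(child))
        population = parents + children
    return max(population, key=fitness)


best = optimize_weights()
selected = top_k(best, len(GOLD))
```

Here `best` holds four positive weights that sum to 1, and `selected` is the set of words the weighted score ranks highest. In a real setting the fitness would more plausibly be measured on a labelled corpus or a downstream classification task rather than on a toy label set.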

Key words

feature extraction/GA/text classification/text clustering/word frequency/word correlation

Classification

Information Technology and Security Science

Cite this article

沈高峰, 谷淑敏. 基于遗传算法优化综合启发式的中文网页特征提取[J]. 智能系统学报, 2014, (4): 474-479, 6.

Funding

Henan Province Basic and Frontier Technology Research Program (102300410266); Doctoral Research Fund of Zhengzhou University of Light Industry.

智能系统学报

OA / 北大核心 (PKU Core) / CSCD / CSTPCD

ISSN 1673-4785
