
Study of Keywords Extraction Based on Statistics and Rules in Information Filtering

Huang Xianzhen, Yang Yuzhen, Liu Peiyu

Computer Engineering (计算机工程), 2012, Vol. 38, Issue 2: 57-59, 3. DOI: 10.3969/j.issn.1000-3428.2012.02.018

Study of Keywords Extraction Based on Statistics and Rules in Information Filtering

Huang Xianzhen (1), Yang Yuzhen (2), Liu Peiyu (3)

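Author information

  • 1. Department of Computer and Information Engineering, Heze University, Heze 274015, Shandong, China
  • 2. School of Information Science and Engineering, Shandong Normal University, Jinan 250014, China
  • 3. Shandong Provincial Key Laboratory for Distributed Computer Software Novel Technology, Jinan 250014, China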

Abstract

In most studies based on the Vector Space Model (VSM), term selection and weight calculation are treated as separate steps. This causes defects such as the loss of semantic content in words after segmentation and the low discriminative power of purely frequency-based weighting. To overcome these shortcomings, a keyword extraction method based on statistics and rules is proposed. Base phrases are extracted using phrase-syntax rules and are used as terms in place of individual words. Taking into account feature frequency, position, distribution, grammatical role, and other information, a joint feature weight function is constructed to improve the discriminative power of terms and to reduce the semantic loss of words. Experimental results show that keywords extracted with this combination of statistics and rules are more effective for text information filtering than those from other methods.
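
The abstract names the ingredients of the joint feature weight function (frequency, position, distribution, grammatical role) but not its exact form. The following is only a minimal sketch of how such a function might combine them for one candidate base phrase within one document. The multiplicative combination, the `Occurrence` record, the `joint_weight` name, and every numeric bonus are assumptions made for illustration and are not taken from the paper.

```python
import math
from dataclasses import dataclass

# Hypothetical bonus tables; the values used in the paper are not given here.
ROLE_BONUS = {"subject": 1.5, "object": 1.2, "other": 1.0}
TITLE_BONUS = 2.0            # occurrence in the title
FIRST_PARAGRAPH_BONUS = 1.5  # occurrence in the opening paragraph


@dataclass
class Occurrence:
    paragraph: int  # 0-based paragraph index; -1 marks the title
    role: str       # grammatical role tag of the phrase at this occurrence


def position_factor(occ: Occurrence) -> float:
    """Map one occurrence to its (assumed) position bonus."""
    if occ.paragraph < 0:
        return TITLE_BONUS
    if occ.paragraph == 0:
        return FIRST_PARAGRAPH_BONUS
    return 1.0


def joint_weight(occurrences: list[Occurrence], n_paragraphs: int) -> float:
    """Combine frequency, position, distribution and role into one score."""
    if not occurrences:
        return 0.0
    # Frequency: damped with a logarithm so long documents do not dominate.
    freq = 1.0 + math.log(len(occurrences))
    # Position and role: take the most favourable occurrence.
    position = max(position_factor(o) for o in occurrences)
    role = max(ROLE_BONUS.get(o.role, 1.0) for o in occurrences)
    # Distribution: share of body paragraphs that contain the phrase.
    covered = {o.paragraph for o in occurrences if o.paragraph >= 0}
    distribution = 1.0 + len(covered) / max(n_paragraphs, 1)
    return freq * position * role * distribution
```

Under these assumptions, a phrase that appears as a subject in the title and again in the opening paragraph scores higher than one that appears once in the body, which is the kind of differentiation the abstract aims for. A weighted sum would be an equally plausible combination.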


Keywords

base phrase; merging rule; role weighting; distribution weighting; position weighting

Classification

Information Technology and Security Science

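Cite this article

Huang Xianzhen, Yang Yuzhen, Liu Peiyu. Study of Keywords Extraction Based on Statistics and Rules in Information Filtering [J]. Computer Engineering, 2012, 38(2): 57-59, 3.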

Funding

National Natural Science Foundation of China (60873247)

Shandong Province High-Tech Independent Innovation Special Project Fund (2008ZZ28)

Computer Engineering (计算机工程)

Indexed in: OA, CSCD, CSTPCD

ISSN 1000-3428
