An Improved TFIDF-Based Approach to Extract Key Words from Web Pages

Li Jingyue¹, Li Peifeng¹, Zhu Qiaoming¹

Computer Applications and Software, 2011, Vol. 28, Issue 5: 25-27, 3.

Author information

  • 1. School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006

Abstract

Although the classical TFIDF-based keyword extraction method is easy to implement and has a low time cost, its results are not good enough at capturing the features that play key roles in the text content. This paper proposes an improved method: it extracts keywords with an improved TFIDF formula, using an extended synonym dictionary and taking into account the structural features of Chinese texts and the lexical categories of Chinese words and phrases. Experimental results show that the proposed method clearly outperforms the classical method and produces satisfactory results.
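For orientation, the classical baseline that the abstract refers to can be sketched as follows. This is a minimal illustration of plain TF-IDF ranking only, not the improved formula of the paper, which the abstract does not state. The function name `tfidf_keywords`, the toy documents, and the choice of relative term frequency with an unsmoothed logarithmic IDF are all assumptions made for the example; the sketch also assumes the text has already been segmented into words.

```python
import math
from collections import Counter


def tfidf_keywords(docs, doc_index, top_k=3):
    """Rank the terms of docs[doc_index] by classical TF-IDF.

    docs is a list of token lists (text already segmented into words).
    The name and signature are hypothetical, chosen for this sketch.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    tokens = docs[doc_index]
    counts = Counter(tokens)
    scores = {}
    for term, count in counts.items():
        tf = count / len(tokens)           # relative term frequency
        idf = math.log(n_docs / df[term])  # unsmoothed inverse document frequency
        scores[term] = tf * idf
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Toy corpus (invented for illustration, not data from the paper).
docs = [
    ["web", "page", "keyword", "extraction", "keyword"],
    ["web", "page", "layout"],
    ["web", "search", "engine"],
]
top = tfidf_keywords(docs, 0, top_k=2)
print(top)
```

In this toy corpus "web" occurs in every document, so its IDF and therefore its score are zero. A term's position in the page and its lexical category play no role in this baseline score, which are the kinds of features the abstract says the improved method takes into account.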

Key words

Text structure / Keyword extraction / Term frequency-inverse document frequency (TFIDF)

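Cite this article

Li Jingyue, Li Peifeng, Zhu Qiaoming. An Improved TFIDF-Based Approach to Extract Key Words from Web Pages [J]. Computer Applications and Software, 2011, 28(5): 25-27, 3.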

Funding

National Natural Science Foundation of China (90920004, 60970056, 60873150)

Natural Science Foundation of Jiangsu Province (BK2008160)

Major Basic Research Project of Natural Science in Jiangsu Higher Education Institutions (08KJA520002)

Computer Applications and Software

OA / Peking University Core Journals / CSCD / CSTPCD

ISSN 1000-386X
