
Research on Focused Crawler Based on Naive Bayes Algorithm
(基于朴素贝叶斯算法的主题爬虫的研究)

PI Jing (皮靖) 1, SHAO Xiongkai (邵雄凯) 1, XIAO Yafu (肖雅夫) 1

Computer & Digital Engineering (计算机与数字工程), 2012, Vol. 40, Issue (6): 76-78, 123, 4.

Author information

  • 1. School of Computer Science, Hubei University of Technology, Wuhan 430068

Abstract

A focused crawler is a key component of a focused search engine. This paper proposes a method that uses the Naive Bayes algorithm to identify topics, and introduces the core parts of the focused crawler: generation of the seed URL collection, page analysis and feature extraction, and topic identification. The focused crawler based on the Naive Bayes algorithm is compared with a focused crawler based on link analysis and a thesaurus. The experimental results show that the Naive Bayes crawler achieves better accuracy and that the method is feasible. This lays a good foundation for topic-specific information collection.
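
The topic identification step named in the abstract can be illustrated with a minimal sketch. The abstract gives no details of the paper's features, training data, or decision rule, so everything below is an assumption: a generic multinomial Naive Bayes classifier with add-one (Laplace) smoothing, a hypothetical class name `NaiveBayesTopicFilter`, hypothetical labels `on_topic` and `off_topic`, and a tiny made-up training set of already tokenized pages.

```python
import math
from collections import Counter


class NaiveBayesTopicFilter:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of term frequencies
        self.doc_counts = Counter()  # label -> number of training pages
        self.vocabulary = set()      # all terms seen during training

    def train(self, labeled_pages):
        """labeled_pages: iterable of (list_of_tokens, label) pairs."""
        for tokens, label in labeled_pages:
            self.doc_counts[label] += 1
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.vocabulary.update(tokens)

    def log_posteriors(self, tokens):
        """Return the unnormalized log posterior of each label for a page."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, counts in self.word_counts.items():
            score = math.log(self.doc_counts[label] / total_docs)  # log prior
            denominator = sum(counts.values()) + vocab_size
            for token in tokens:
                if token in self.vocabulary:  # skip terms never seen in training
                    score += math.log((counts[token] + 1) / denominator)
            scores[label] = score
        return scores

    def is_relevant(self, tokens, topic_label="on_topic"):
        """True when the topic label has the highest posterior."""
        scores = self.log_posteriors(tokens)
        return max(scores, key=scores.get) == topic_label


# Hypothetical training pages, already reduced to feature terms.
training_pages = [
    (["naive", "bayes", "classifier", "crawler"], "on_topic"),
    (["focused", "crawler", "search", "engine"], "on_topic"),
    (["football", "match", "score"], "off_topic"),
    (["recipe", "cooking", "dinner"], "off_topic"),
]

topic_filter = NaiveBayesTopicFilter()
topic_filter.train(training_pages)
```

In a crawler loop, a call such as `topic_filter.is_relevant(page_tokens)` could decide whether a fetched page is kept and its links are queued. Whether the paper applies the classifier in this way is not stated in the abstract.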

Key words

Naive Bayes algorithm / focused crawler / topic correlativity / information collection

Classification

Information Technology and Security Science

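Cite this article

PI Jing, SHAO Xiongkai, XIAO Yafu. Research on Focused Crawler Based on Naive Bayes Algorithm [J]. Computer & Digital Engineering, 2012, 40(6): 76-78, 123, 4.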

Computer & Digital Engineering

ISSN 1672-9722
