PageRank Focused Crawler Algorithm Based on Fuzzy SVDD Supervision

汪伟 魏岩 杨煜普

Computer Applications and Software, Issue (3): 16-19,30,5. DOI: 10.3969/j.issn.1000-386x.2014.03.005

PAGERANK FOCUSED CRAWLER ALGORITHM BASED ON FUZZY SVDD SUPERVISION

汪伟¹, 魏岩¹, 杨煜普¹

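Author information

  • 1. Key Laboratory of System Control and Information Processing (Ministry of Education), Department of Automation, Shanghai Jiao Tong University, Shanghai 200240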

Abstract

A focused crawler is a web crawler that collects resources from a specific domain. To ensure the precision of focused crawling, this article proposes a PageRank crawler algorithm supervised by fuzzy SVDD (support vector domain description). The algorithm not only considers the link relations among pages but also uses classifier supervision to keep the crawler from drifting off topic. Experiments compare it with a keyword-matching focused crawler, a shark-search focused crawler, a PageRank focused crawler, an SVM-prediction-based focused crawler, and an ordinary SVDD-guided focused crawler; the results show that the proposed algorithm is more precise.

Key words

Fuzzy SVDD/PageRank/Focused crawler

Classification

Information Technology and Security Science

Cite this article

汪伟, 魏岩, 杨煜普. PageRank focused crawler algorithm based on fuzzy SVDD supervision [J]. Computer Applications and Software, 2014, (3): 16-19,30,5.

Funding

National High Technology Research and Development Program of China (2011AA040605).

Computer Applications and Software

OA / CSCD / CSTPCD

ISSN 1000-386X
