Journal of Southwest Jiaotong University, 2011, Vol. 46, Issue 3: 456-460 (5 pages). DOI: 10.3969/j.issn.0258-2724.2011.03.017
基于中心性和PageRank的网页综合评分方法
Hybrid Page Scoring Algorithm Based on Centrality and PageRank
Abstract
To score Web pages effectively, a new page scoring algorithm, CentralRank, is proposed. It combines three centrality measures (degree, betweenness and closeness) with the PageRank algorithm. CentralRank first computes the importance of pages in Web social networks from the centrality measures and then applies the PageRank algorithm to score the pages accurately. To verify the performance of CentralRank, a Web crawler was developed to crawl Web pages automatically and efficiently. The crawler relies on three essential techniques, namely Web data collection, content analysis and duplicate page detection. Experiments on real data show that CentralRank keeps the time overhead low and scores Web pages more accurately than the centrality-based page ranking algorithm and the PageRank algorithm, with average improvements of 14.2% and 7.5%, respectively.
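The abstract does not state how CentralRank combines the centrality measures with PageRank, so the following is only a minimal sketch of the general idea, not the authors' method. It computes degree and closeness centrality and a power-iteration PageRank on a toy link graph, then blends them. The graph `LINKS`, the weight `alpha`, the helper names and the blending formula are all assumptions made for illustration, and betweenness is omitted for brevity.

```python
from collections import deque

# Hypothetical toy link graph: page -> pages it links to.
# Every link target must also appear as a key.
LINKS = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

def undirected_neighbours(links):
    """Treat each hyperlink as an undirected edge."""
    nbrs = {p: set(out) for p, out in links.items()}
    for p, out in links.items():
        for q in out:
            nbrs[q].add(p)
    return nbrs

def degree_centrality(links):
    """Number of distinct neighbours divided by (n - 1)."""
    nbrs = undirected_neighbours(links)
    n = len(links)
    return {p: len(s) / (n - 1) for p, s in nbrs.items()}

def closeness_centrality(links):
    """Reachable-page count divided by the sum of BFS distances."""
    nbrs = undirected_neighbours(links)
    scores = {}
    for src in links:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in nbrs[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        scores[src] = (len(dist) - 1) / total if total else 0.0
    return scores

def pagerank(links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank on the directed link graph."""
    n = len(links)
    rank = {p: 1.0 / n for p in links}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in links}
        for p, out in links.items():
            if out:
                share = damping * rank[p] / len(out)
                for q in out:
                    new[q] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for q in links:
                    new[q] += damping * rank[p] / n
        rank = new
    return rank

def combined_score(links, alpha=0.5):
    """Illustrative blend only; NOT the paper's CentralRank formula."""
    deg = degree_centrality(links)
    clo = closeness_centrality(links)
    pr = pagerank(links)
    pr_max = max(pr.values())
    return {
        p: alpha * (deg[p] + clo[p]) / 2 + (1 - alpha) * pr[p] / pr_max
        for p in links
    }
```

Calling `combined_score(LINKS)` returns one score per page. In this toy graph page C, which every other page links to, receives the highest score under all three measures, so it also tops the blended ranking.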
Keywords
social network analysis / Web social network / centrality / PageRank algorithm / Web page scoring
Classification
Information Technology and Security Science
Cite this article
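QIAO Shaojie, PENG Jing, LI Tianrui, LI Hong, LI Taiyong, WANG Chao. Hybrid Page Scoring Algorithm Based on Centrality and PageRank [J]. Journal of Southwest Jiaotong University, 2011, 46(3): 456-460.
Funding
China Postdoctoral Science Foundation (No. 20090461346)
Fundamental Research Funds for the Central Universities (No. SWJTU09CX035)
Youth Foundation for Humanities and Social Sciences Research of the Ministry of Education (No. 10YJCZH117)
Youth Science and Technology Foundation of Sichuan Province (No. 08ZQ026-016)
Science and Technology Support Program of Sichuan Province (No. 2010GZ0123)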