
Web spam detection based on multi-view canonical correlation analysis

高爽 张化祥 房晓南

计算机应用研究 (Application Research of Computers), 2013, Vol. 30, Issue 3: 810-813, 4. DOI: 10.3969/j.issn.1001-3695.2013.03.043

Multi-view canonical correlation analysis based Web spam detection

高爽 (1), 张化祥 (2), 房晓南 (3)

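Author information

  • 1. School of Information Science and Engineering, Shandong Normal University, Jinan 250014
  • 2. Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan 250014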

Abstract

This paper first divides the features of Web pages into two views: a content-feature view and a link-feature view. It then applies canonical correlation analysis (CCA) and extended variants of CCA for feature extraction, generating two new feature sets for each Web page. The two new feature sets are combined in different ways to produce a single view of each page, which is then used to train classification algorithms. Experimental results show that treating Web page data as two-view data and applying multi-view CCA techniques can effectively improve the accuracy of Web spam detection.
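
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which CCA variants or which combinations were used, so plain regularized CCA and two common fusion rules (concatenation and element-wise sum) stand in for them here. The names `cca_fit` and `fuse`, the regularization value, and the synthetic data are all hypothetical.

```python
import numpy as np


def cca_fit(X, Y, k, reg=1e-6):
    """Fit plain CCA on two views; returns means, projections and correlations.

    X: (n, p) content-feature view, Y: (n, q) link-feature view.
    reg is a small ridge term (an assumption) that keeps the covariances invertible.
    """
    n = X.shape[0]
    mean_x, mean_y = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mean_x, Y - mean_y
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    # Whiten each view with a Cholesky factor: Sxx = Lx Lx^T, Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # T = Lx^{-1} Sxy Ly^{-T}; its singular values are the canonical correlations.
    T = np.linalg.solve(Lx, Sxy)
    T = np.linalg.solve(Ly, T.T).T
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    # Map the singular vectors back to the original feature spaces.
    Wx = np.linalg.solve(Lx.T, U[:, :k])
    Wy = np.linalg.solve(Ly.T, Vt[:k].T)
    return mean_x, mean_y, Wx, Wy, s[:k]


def fuse(Zx, Zy, how="concat"):
    """Combine the two projected feature sets into a single view."""
    if how == "concat":
        return np.hstack([Zx, Zy])   # serial combination: 2k features
    if how == "sum":
        return Zx + Zy               # parallel combination: k features
    raise ValueError(f"unknown fusion rule: {how}")


# Synthetic stand-in for a Web page data set: two views sharing a 2-D latent signal.
rng = np.random.default_rng(0)
n = 200
z = rng.normal(size=(n, 2))
X = z @ rng.normal(size=(2, 5)) + 0.5 * rng.normal(size=(n, 5))  # "content" view
Y = z @ rng.normal(size=(2, 4)) + 0.5 * rng.normal(size=(n, 4))  # "link" view

mean_x, mean_y, Wx, Wy, rho = cca_fit(X, Y, k=2)
Zx = (X - mean_x) @ Wx   # new feature set from the content view
Zy = (Y - mean_y) @ Wy   # new feature set from the link view

Z_concat = fuse(Zx, Zy, "concat")
Z_sum = fuse(Zx, Zy, "sum")
```

Either fused matrix (`Z_concat` or `Z_sum`) could then be passed, together with spam labels, to any standard classifier. Which fusion rule and classifier work best is an empirical question that this sketch does not address.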

Key words

Web spam detection / canonical correlation analysis (CCA) / multi-view classification / feature extraction

Classification

Information technology and security science

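Cite this article

高爽, 张化祥, 房晓南. Multi-view canonical correlation analysis based Web spam detection [J]. 计算机应用研究 (Application Research of Computers), 2013, 30(3): 810-813, 4.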

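Funding

National Natural Science Foundation of China (61170145)

Specialized Research Fund for the Doctoral Program of Higher Education, Ministry of Education of China (20113704110001)

Natural Science Foundation of Shandong Province and Shandong Science and Technology Development Program (ZR2010FM021, 2008B0026, 2010G0020115)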

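Journal: 计算机应用研究 (Application Research of Computers), ISSN 1001-3695

Indexing: OA, 北大核心 (Peking University Core Journals), CSCD, CSTPCD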
