A Phishing Website Detection Method Based on Ensemble Learning

余恩泽 努尔布力 于清

Computer Engineering and Applications, 2019, Vol. 55, Issue (18): 81-88, 200, 9. DOI: 10.3778/j.issn.1002-8331.1812-0362

Phishing Website Detection Method Based on Integrated Learning

余恩泽 1, 努尔布力 1, 于清 1

Author information

  • 1. School of Information Science and Engineering, Xinjiang University, Urumqi 830046

Abstract

To counter the fake HTTPS websites and other obfuscation techniques commonly used by phishing attackers, this paper draws on RMLR and PhishDef, current mainstream phishing website detection methods based on machine learning and rule matching, and adds features such as web page text keywords and web page sub-links. On this basis it proposes the Nmap-RF classification method, an ensemble phishing website detection method that combines rule matching with a random forest. Each website is first pre-filtered according to its web page protocol; if it is judged to be a phishing website at this stage, the subsequent feature extraction step is skipped. Otherwise, text keyword confidence, page sub-link confidence, phishing vocabulary similarity, and page PageRank are taken as key features, with common URL, Whois, DNS, and web page tag information as auxiliary features, and the random forest classification model makes the final judgment. Experiments show that the Nmap-RF ensemble method can detect phishing pages in an average of 9~10 μs, can filter out 98.4% of illegal pages, and achieves an average overall accuracy of 99.6%.
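
The two-stage flow described in the abstract (rule-based pre-filter first, feature-based classifier only when the rule does not fire) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the pre-filter rule (a site that claims HTTPS but fails a certificate check), the `Site` record, the feature names, the thresholds, and the direction of each feature are all hypothetical, and a majority vote over hand-set threshold stumps stands in for the trained random forest.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

PHISHING, LEGITIMATE = 1, 0


@dataclass
class Site:
    url: str
    cert_valid: bool            # hypothetical result of a protocol/certificate probe
    features: Dict[str, float]  # hypothetical precomputed features scaled to [0, 1]


def protocol_prefilter(site: Site) -> bool:
    """Stage 1 (assumed rule): flag sites that claim HTTPS but fail the certificate check."""
    return site.url.lower().startswith("https://") and not site.cert_valid


# Stage 2 stand-in: each voter is a one-feature threshold test (a "stump").
# A trained random forest would learn its splits from labeled data; these are hand-set.
Voter = Callable[[Dict[str, float]], int]


def stump(name: str, threshold: float, phishing_if_above: bool = True) -> Voter:
    def vote(features: Dict[str, float]) -> int:
        above = features.get(name, 0.0) > threshold
        return PHISHING if above == phishing_if_above else LEGITIMATE
    return vote


VOTERS: List[Voter] = [
    stump("keyword_confidence", 0.5),
    stump("sublink_confidence", 0.5),
    stump("phish_vocab_similarity", 0.6),
    stump("pagerank", 0.2, phishing_if_above=False),  # assumption: low rank is suspicious
]


def classify(site: Site) -> int:
    if protocol_prefilter(site):
        return PHISHING  # short-circuit: the feature-based stage is skipped
    votes = sum(voter(site.features) for voter in VOTERS)
    # Strict majority is required; a tie is treated as legitimate.
    return PHISHING if votes * 2 > len(VOTERS) else LEGITIMATE
```

The point of the cascade is cost: a site caught by the cheap rule never pays for keyword, sub-link, Whois, or DNS feature extraction, which is the expensive part of the pipeline.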

Key words

phishing websites / ensemble learning / rule matching / phishing obfuscation techniques

Classification

Information technology and security science

Cite this article

余恩泽, 努尔布力, 于清. A phishing website detection method based on ensemble learning [J]. Computer Engineering and Applications, 2019, 55(18): 81-88, 200, 9.

Funding

National Natural Science Foundation of China (No. 61433012, No. 61562082, No. 61303231).

Computer Engineering and Applications

OA / PKU Core / CSCD / CSTPCD

ISSN 1002-8331
