Email Masquerade Attack Detection Based on Semi-Supervised Learning (Open Access)
[Objective] Masquerade attacks are a typical threat to email systems: attackers illicitly obtain a user's genuine authentication credentials and use them to access services without authorization, causing significant damage. Because email usage scenarios are complex, the data are unevenly distributed, and only a limited amount of labeled anomaly data is available, detecting masquerade attacks in email systems is difficult. [Methods] To address these problems, we propose a rule-based self-training auto-encoder anomaly detection framework. First, the framework analyzes and categorizes the application scenarios found in SMTP email protocol log data and introduces coarse-grained label correction rules. Second, it applies an auto-encoder iteratively through self-training, with the rules correcting the results of each detection round. Finally, kernel density estimation is used to find a suitable threshold that reduces the false positive rate. [Results] Using three consecutive months of data from 6,736 real corporate email accounts, the framework detected 7 anomalous accounts and 12 anomalous IP addresses. Compared with the corporate Security Operations Center (SOC) and three state-of-the-art algorithms, it achieved the best results: it detected 75% more anomalous accounts than the SOC while reducing the number of false-positive accounts by 81.3%.
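The three steps named in the abstract (auto-encoder scoring, rule-based label correction inside a self-training loop, and a threshold chosen by kernel density estimation) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything specific in it is an assumption: a linear auto-encoder (PCA via SVD) stands in for the neural auto-encoder, the threshold is taken at the first density valley to the right of the main mode, the `whitelisted` rule is a hypothetical stand-in for the coarse-grained correction rules, and the data are synthetic.

```python
import numpy as np


def recon_error(X, train_mask, k=1):
    """Squared reconstruction error of a linear auto-encoder (PCA) fitted on X[train_mask]."""
    mu = X[train_mask].mean(axis=0)
    _, _, vt = np.linalg.svd(X[train_mask] - mu, full_matrices=False)
    W = vt[:k]                          # rows span the k-dimensional code space
    centered = X - mu
    recon = (centered @ W.T) @ W        # encode, then decode
    return ((centered - recon) ** 2).sum(axis=1)


def kde_threshold(err, grid_size=512, fallback_q=0.99):
    """Gaussian KDE over the errors; threshold = first valley right of the main mode."""
    h = 1.06 * err.std() * err.size ** (-0.2)   # normal-reference rule-of-thumb bandwidth
    if h == 0:                                  # all errors identical: flag nothing
        return float(err.max())
    grid = np.linspace(err.min(), err.max(), grid_size)
    # Unnormalized density; only its shape matters for locating the valley.
    dens = np.exp(-0.5 * ((grid[:, None] - err[None, :]) / h) ** 2).sum(axis=1)
    peak = int(dens.argmax())
    for i in range(peak + 1, grid_size - 1):
        if dens[i] <= dens[i - 1] and dens[i] < dens[i + 1]:
            return float(grid[i])
    return float(np.quantile(err, fallback_q))  # no valley found: use a high quantile


def self_train(X, rule_fn, max_iter=10):
    """Refit on presumed-normal rows, re-score, correct labels by rule, until labels are stable."""
    flagged = np.zeros(len(X), dtype=bool)      # start by treating every sample as normal
    for _ in range(max_iter):
        err = recon_error(X, ~flagged)
        new_flagged = rule_fn(err > kde_threshold(err))
        if np.array_equal(new_flagged, flagged):
            break
        flagged = new_flagged
    return flagged


# Synthetic demo: 200 normal rows near a line, 6 off-line rows (indices 200-205).
rng = np.random.default_rng(0)
t = rng.normal(0, 2, size=200)
normal_rows = np.outer(t, [1.0, 1.0, 0.0]) + rng.normal(0, 0.05, size=(200, 3))
odd_rows = np.array([0.0, 0.0, 5.0]) + rng.normal(0, 0.05, size=(6, 3))
X = np.vstack([normal_rows, odd_rows])

# Hypothetical correction rule: row 205 is a known benign sender and is never flagged.
whitelisted = np.zeros(len(X), dtype=bool)
whitelisted[205] = True

flagged = self_train(X, rule_fn=lambda f: f & ~whitelisted)
print("flagged rows:", np.flatnonzero(flagged))
```

In this sketch the rule is applied after every scoring round, so a corrected label also changes which rows the next round is fitted on; that is the only sense in which it mirrors the rule-corrected self-training described above.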
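Li Chang (李畅); Long Chun (龙春); Zhao Jing (赵静); Yang Yue (杨悦); Wang Yueda (王跃达); Pan Qingfeng (潘庆峰); Ye Xiaohu (叶晓虎); Wu Tiejun (吴铁军); Tang Ning (唐宁)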
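Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083 || University of Chinese Academy of Sciences, Beijing 100039; Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083; Lunke Technology (Guangzhou) Co., Ltd. (论客科技(广州)有限公司), Guangzhou, Guangdong 511400; NSFOCUS Technologies Group Co., Ltd., Beijing 100089; School of Cyber Science and Engineering, Southeast University, Nanjing, Jiangsu 211189; Beijing Topsec Network Security Technology Co., Ltd., Beijing 100193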
semi-supervised learning; self-training; auto-encoder; masquerade attack; email protocol
Frontiers of Data & Computing (《数据与计算发展前沿》), 2024 (002)
Pages 56-66 (11 pages)
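National Key Research and Development Program of China, "Research on Security Risk Assessment, Monitoring, and Tracing Technology for Full-Lifecycle Financial Data Circulation" (2023YFC3304704); Chinese Academy of Sciences Cybersecurity and Informatization Fund, "Cybersecurity Assurance System Construction Project" (CAS-WX2022GC-04); Strategic Priority Research Program of the Chinese Academy of Sciences, "Biological Data Storage Management and Interactive Utilization System" (XDB38030000)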