Journal of Jilin University (Information Science Edition), 2023, Vol. 41, Issue 6: 1112-1119, 8.
Network Intrusion Detection Algorithm for Imbalanced Datasets
Abstract
A network intrusion detection algorithm that combines systematic data preprocessing with hybrid sampling is proposed to address class imbalance in intrusion detection datasets. Based on the feature distributions of the intrusion detection dataset, the feature values are processed as follows: for the three categorical features "Proto", "Service", and "State", minor categories within each feature are merged to reduce the total dimensionality of the one-hot encoding; the 18 numerical features with extreme distributions are log-transformed and then standardized according to their numerical distribution. A class-imbalance handling technique that combines Nearmiss-1 under-sampling with SMOTE (Synthetic Minority Over-sampling Technique) is designed. Each class of samples in the training dataset is divided into sub-classes according to the "Proto", "Service", and "State" categorical features, and each sub-class is under-sampled or over-sampled in equal proportion. The resulting intrusion detection model, PSSNS-RF (Nearmiss and SMOTE based on Proto, Service, State-Random Forest), achieves a 97.02% multiclass detection rate on the UNSW-NB15 dataset, resolving the data imbalance problem and significantly improving the detection rate of minority classes.
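The preprocessing and proportional sub-class allocation described in the abstract can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' implementation. The `min_count` threshold, the `"other"` label, the use of log(1 + x) as the logarithm, the population standard deviation, and the rounding rule are all assumptions made for the example. Nearmiss-1 and SMOTE themselves are not implemented here; the last helper only computes how many samples each (Proto, Service, State) sub-class would keep or receive when its class is resampled to a target size.

```python
import math
from collections import Counter


def merge_rare(values, min_count, other="other"):
    """Replace categories seen fewer than min_count times with one shared label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other for v in values]


def one_hot(values):
    """One-hot encode a categorical column; columns follow sorted category order."""
    cats = sorted(set(values))
    index = {c: i for i, c in enumerate(cats)}
    rows = [[1 if index[v] == j else 0 for j in range(len(cats))] for v in values]
    return cats, rows


def log_standardize(column):
    """Apply log(1 + x) to a non-negative column, then z-score standardize it."""
    logged = [math.log1p(x) for x in column]
    mean = sum(logged) / len(logged)
    var = sum((x - mean) ** 2 for x in logged) / len(logged)
    std = math.sqrt(var) or 1.0  # guard against a constant column
    return [(x - mean) / std for x in logged]


def subclass_targets(keys, target_total):
    """Scale every sub-class of one class by the same ratio (assumed rounding rule)."""
    counts = Counter(keys)
    ratio = target_total / len(keys)
    return {k: max(1, round(n * ratio)) for k, n in counts.items()}


# Toy data, invented for illustration only.
proto = ["tcp", "tcp", "udp", "tcp", "udp", "ospf", "sctp"]
merged = merge_rare(proto, min_count=2)      # rare "ospf" and "sctp" share one label
cats, rows = one_hot(merged)                 # 3 one-hot columns instead of 4

scaled = log_standardize([0, 10, 1000, 100000, 5, 50, 500])

# One class with two sub-classes, under-sampled from 10 samples to 5.
keys = [("tcp", "http", "FIN")] * 6 + [("udp", "dns", "INT")] * 4
targets = subclass_targets(keys, target_total=5)
```

In this toy run, merging shrinks the one-hot width from four columns to three, and the two sub-classes keep their 6:4 ratio after resampling (3 and 2 samples). A target larger than the class size would give the over-sampling counts in the same way.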
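Key words
network intrusion detection / imbalanced dataset / feature selection / network security
Classification
Information technology and security science
Cite this article
徐忠原, 杨秀华, 王业, 李玲. Network Intrusion Detection Algorithm for Imbalanced Datasets [J]. Journal of Jilin University (Information Science Edition), 2023, 41(6): 1112-1119, 8.
Funding
Jilin Province Science and Technology Development Plan Fund Project (20190302073GX)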