Encrypted traffic classification method based on Low-Dimensional Second-order Markov matrix

Guo Hao, Chen Zhouguo, Liu Zhi, Leng Tao, Guo Xianchao, Zhang Yanfeng

Journal of Sichuan University (Natural Science Edition), 2024, Vol. 61, Issue 3: 30-37, 8. DOI: 10.19907/j.0490-6756.2024.030003

Encrypted traffic classification method based on Low-Dimensional Second-order Markov matrix

Guo Hao¹, Chen Zhouguo², Liu Zhi¹, Leng Tao³, Guo Xianchao³, Zhang Yanfeng³

Author information

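  • 1. School of Computer Science, Southwest Petroleum University, Chengdu 610500
  • 2. The 30th Research Institute of China Electronics Technology Group Corporation, Chengdu 610041
  • 3. Intelligent Policing Key Laboratory of Sichuan Province, Sichuan Police College, Luzhou 646000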

Abstract

Network traffic encryption enhances communication security and privacy protection, but it also poses new challenges for malicious traffic detection. Machine learning has been successfully applied in various fields, including encrypted traffic classification. However, traditional feature extraction methods may lose important information or retain redundant, invalid information in the traffic, which hinders further improvement of classification accuracy and efficiency. This paper proposes an encrypted traffic classification method based on a Low-Dimensional Second-order Markov matrix (LDSM), which selects traffic features with high representational ability to improve classification performance. First, the payload of encrypted traffic is extracted and a second-order Markov matrix is constructed according to the distribution of the payload over the hexadecimal character space. Second, the Gini gain of each feature in the state transition probability matrix is computed, the feature that contributes least to model training is iteratively deleted, and the feature set with the highest classification accuracy is selected as the low-dimensional second-order Markov matrix feature. Finally, the effectiveness of these features in model training is verified through experiments. The experiments use a Scikit-learn environment and three public datasets (CTU-13, CIC-IDS2017, and CIC IoT Dataset 2023), along with self-collected real network traffic, to perform encrypted traffic classification. The feature dimensionality reduction experiments show that LDSM performs best when the second-order Markov matrix features are reduced to 256 dimensions. After reduction, only 6.25% of the original features remain, which preserves classification accuracy while improving training efficiency. Compared with other methods, the experimental results demonstrate that the average accuracy of the LDSM method for traffic classification reaches 98.52%, which is more than 3% higher than other methods. Thus, LDSM is a feasible and effective method for encrypted traffic classification.
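
The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions: the matrix layout (256 rows for the previous two hexadecimal characters, 16 columns for the next one, giving 4096 features, which would be consistent with 256 features being 6.25% of the original), the single-threshold definition of Gini gain, and the one-shot top-k selection, which stands in for the iterative, accuracy-driven deletion the paper reports. All function names are hypothetical.

```python
from collections import Counter

HEX_CHARS = "0123456789abcdef"
HEX_INDEX = {c: i for i, c in enumerate(HEX_CHARS)}


def second_order_markov(payload: bytes) -> list[list[float]]:
    """Return a 256 x 16 row-normalised transition matrix over hex characters.

    Assumed layout: row = (char[i-2], char[i-1]) encoded as a*16 + b,
    column = char[i].
    """
    symbols = [HEX_INDEX[c] for c in payload.hex()]
    counts = [[0] * 16 for _ in range(256)]
    for a, b, c in zip(symbols, symbols[1:], symbols[2:]):
        counts[a * 16 + b][c] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows for states never observed stay all-zero.
        matrix.append([n / total if total else 0.0 for n in row])
    return matrix


def flatten(matrix: list[list[float]]) -> list[float]:
    """Flatten the matrix into one feature vector (4096 values here)."""
    return [p for row in matrix for p in row]


def gini(labels: list) -> float:
    """Gini impurity of a label list."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((k / n) ** 2 for k in Counter(labels).values())


def gini_gain(values: list[float], labels: list) -> float:
    """Largest impurity decrease from a single threshold split (assumed definition)."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    for t in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def select_top_k(X: list[list[float]], y: list, k: int) -> list[int]:
    """Indices of the k features with the highest Gini gain.

    Simplification: one ranking pass, not the iterative deletion in the paper.
    """
    gains = [gini_gain([row[j] for row in X], y) for j in range(len(X[0]))]
    order = sorted(range(len(gains)), key=lambda j: gains[j], reverse=True)
    return sorted(order[:k])
```

For example, `flatten(second_order_markov(packet_payload))` would produce one feature vector per flow, and `select_top_k(X, y, 256)` would keep the 256 highest-ranked columns before training a classifier; `packet_payload`, `X`, and `y` are placeholder names for data the reader supplies.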

Key words

Encrypted traffic/Machine learning/Markov/Gini gain/Feature dimensionality reduction

Classification

Information Technology and Security Science

Cite this article

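Guo Hao, Chen Zhouguo, Liu Zhi, Leng Tao, Guo Xianchao, Zhang Yanfeng. Encrypted traffic classification method based on Low-Dimensional Second-order Markov matrix [J]. Journal of Sichuan University (Natural Science Edition), 2024, 61(3): 30-37, 8.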

Funding

Project funded by the Intelligent Policing Key Laboratory of Sichuan Province (ZNJW2022KFQN003)

Journal of Sichuan University (Natural Science Edition)

Open access; Peking University Core Journal; CSTPCD

ISSN 0490-6756
