Application Research of Computers (计算机应用研究), 2023, Vol. 40, Issue 12: 3696-3700, 3705, 6. DOI: 10.19734/j.issn.1001-3695.2023.04.0179
基于字符距离聚类的未知工控协议分类方法
Character distance clustering-based classification algorithm for unknown industrial control protocols
Abstract
Classifying unknown industrial control protocols is a prerequisite for identifying multiple types of mixed industrial control protocols. Exploiting the short, simple format of industrial control protocol messages, which consist of binary characters, this paper proposes an unknown industrial control protocol classification method based on character distance clustering. Previous classification algorithms mainly computed the Euclidean distance between text protocols, which cannot accurately reflect the similarity of unknown industrial control protocol messages. In contrast, the proposed algorithm classifies unknown industrial control protocols by constructing binary feature sequences, calculating their character distances, and performing K-means clustering. To guarantee classification accuracy, the paper also proposes an algorithm that determines the optimal number of clusters K based on the maximum average character distance. Semi-physical simulation results show that the classification accuracy for unknown industrial control protocols can reach 96.80%, while the protocol type identification accuracy can reach 97.07%.
Keywords
industrial control protocol / protocol classification / character distance / K-means clustering
Classification
Information Technology and Security Science
Cite this article
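屠雅春, 许驰, 杜昕宜, 王倚天, 夏长清, 金曦. Character distance clustering-based classification algorithm for unknown industrial control protocols [J]. Application Research of Computers, 2023, 40(12): 3696-3700, 3705, 6.
Funding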
National Natural Science Foundation of China (92267108, 62173322, 61972389, 62133014)
Science Program of Liaoning Province (2023JH3/1020004, 2023JH3/10200006, 2022JH25/10100005)
Youth Innovation Promotion Association of the Chinese Academy of Sciences (2019202, 2020207, Y2021062)
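
Illustrative sketch
The abstract outlines a pipeline of binary feature sequences, character distances, K-means clustering, and a choice of K by maximum average character distance, but it does not define any of these steps in detail. The Python sketch below is one possible reading, not the authors' implementation. It assumes that the feature sequence is the first few bytes of a message expanded to bits, that the character distance is the Hamming distance, that cluster centers are updated by per-bit majority vote (a k-modes-style stand-in for the K-means mean), and that K is chosen to maximize the average pairwise distance between centers. All function names and parameters are hypothetical.

```python
def to_bits(message: bytes, n_bytes: int = 4) -> tuple:
    """First n_bytes of a message as a fixed-length bit tuple (zero-padded)."""
    head = message[:n_bytes].ljust(n_bytes, b"\x00")
    return tuple((byte >> shift) & 1 for byte in head for shift in range(7, -1, -1))


def char_distance(a, b) -> int:
    """Assumed character distance: number of differing bit positions (Hamming)."""
    return sum(x != y for x, y in zip(a, b))


def init_centers(seqs, k):
    """Deterministic farthest-point initialization (an assumption, not from the paper)."""
    centers = [seqs[0]]
    while len(centers) < k:
        centers.append(max(seqs, key=lambda s: min(char_distance(s, c) for c in centers)))
    return centers


def cluster(seqs, k, max_iter=20):
    """K-means-style loop: assign to nearest center, then update centers by bit majority."""
    centers = init_centers(seqs, k)
    labels = []
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda j: char_distance(s, centers[j])) for s in seqs]
        new_centers = []
        for j in range(k):
            members = [s for s, lab in zip(seqs, labels) if lab == j]
            if not members:
                new_centers.append(centers[j])  # keep the old center for an empty cluster
                continue
            # Majority vote per bit position; ties resolve to 1.
            new_centers.append(tuple(int(2 * sum(col) >= len(members)) for col in zip(*members)))
        if new_centers == centers:
            break  # converged: labels match the current centers
        centers = new_centers
    return labels, centers


def average_center_distance(centers):
    """Mean pairwise distance between centers; requires at least two centers."""
    pairs = [(a, b) for i, a in enumerate(centers) for b in centers[i + 1:]]
    return sum(char_distance(a, b) for a, b in pairs) / len(pairs)


def choose_k(seqs, candidates):
    """Pick the candidate K (each >= 2) with the largest average center distance."""
    return max(candidates, key=lambda k: average_center_distance(cluster(seqs, k)[1]))
```

With captured messages as `bytes` objects, a call such as `cluster([to_bits(m) for m in messages], choose_k(seqs, range(2, 6)))` would return one label per message plus the cluster centers. The prefix length `n_bytes` and the candidate range for K are tuning choices made for this sketch and are not taken from the paper.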