Journal of Shenyang University of Technology, 2025, Vol. 47, Issue 5: 609-616 (8 pages). DOI: 10.7688/j.issn.1000-1646.2025.05.08
Reverse engineering of industrial control protocols based on BERT-BiLSTM-CRF (original title: 基于BERT-BiLSTM-CRF的工业控制协议逆向工程)
Abstract
[Objective] Industrial control protocol parsing is a critical component of industrial internet security. However, traditional methods suffer from poor universality and low accuracy. These issues lead to low efficiency in protocol parsing, making it difficult to meet the demands for high precision and adaptability in real-world industrial scenarios.

[Methods] A deep learning-based reverse engineering method was proposed for industrial control protocols by integrating a bidirectional encoder representations from transformers (BERT) pre-trained model, a bidirectional long short-term memory (BiLSTM) network, and conditional random fields (CRF). The goal is to enhance the universality and accuracy of protocol parsing, thereby providing technical support for security analysis and vulnerability mining in industrial control systems. First, the BERT pre-trained model was employed to dynamically encode industrial control protocol data into high-dimensional word vector representations, so as to capture the semantic information of the protocol data. Leveraging the contextual understanding capabilities of BERT, the model handled the complexity and diversity of protocol data. Subsequently, a BiLSTM network was utilized to model the relationships between protocol data as well as between protocol data and label data. The BiLSTM network captured long-range dependencies within the protocol data, enabling a better understanding of the structure and semantics of the protocol. Finally, a CRF was introduced as a constraint to optimize the prediction of protocol formats and semantics. By incorporating transition probabilities between labels, the CRF further enhanced prediction accuracy and consistency. The combination of the BERT pre-trained model, BiLSTM network, and CRF enabled the format extraction and semantic analysis of industrial control protocols. Additionally, the proposed method was optimized for large-scale protocol data, which ensured efficiency and stability in complex industrial scenarios.

[Results] Experiments were conducted on three typical industrial control protocols. The results demonstrate that the proposed method achieves an accuracy of over 96% in both format extraction and semantic analysis, outperforming traditional methods. The method exhibits high adaptability and accuracy across different protocols, effectively identifying field boundaries and semantic information.

[Conclusion] The proposed method significantly improves the universality and accuracy of industrial control protocol parsing, providing reliable technical support for security analysis in industrial control systems. Future work will focus on further optimizing the model, expanding its application scenarios, and enhancing its practicality.

Keywords
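industrial control protocol / protocol reverse engineering / BERT pre-trained model / bidirectional long short-term memory network / conditional random field / word vector / format extraction / semantic analysis

Classification
Information technology and security science

Cite this article
Lian Lian (连莲), Li Sumin (李素敏), Zong Xuejun (宗学军), He Kan (何戡). Reverse engineering of industrial control protocols based on BERT-BiLSTM-CRF [J]. Journal of Shenyang University of Technology, 2025, 47(5): 609-616, 8.

Funding
Natural Science Foundation of Liaoning Province (2023-MSLH-273)
Liaoning Province Science and Technology Innovation Platform Construction Plan Project (辽科发[2022]36号)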
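
Illustrative sketch

The abstract states that the CRF stage uses transitions between labels to keep predictions consistent, and that the output identifies field boundaries and semantics. The following is a minimal sketch of that last stage only, not the authors' implementation. It assumes a BIO-style tagging scheme with one label per byte, hypothetical field names (`TID`, `LEN`), hand-written emission scores standing in for the BERT-BiLSTM output, and hard 0 / negative-infinity transition scores in place of learned ones.

```python
from typing import Dict, List, Optional, Tuple

NEG_INF = float("-inf")
LABELS = ["B-TID", "I-TID", "B-LEN", "I-LEN"]  # hypothetical label set


def transition(prev: Optional[str], cur: str) -> float:
    """Assumed hard constraint: I-X may only follow B-X or I-X.
    A trained CRF would learn real-valued scores here instead."""
    if cur.startswith("I-"):
        kind = cur[2:]
        if prev not in ("B-" + kind, "I-" + kind):
            return NEG_INF
    return 0.0


def viterbi(emissions: List[Dict[str, float]], labels: List[str]) -> List[str]:
    """Return the highest-scoring label sequence under emission + transition scores."""
    # score[l] = best score of any path ending in label l at the current position
    score = {l: transition(None, l) + emissions[0][l] for l in labels}
    back: List[Dict[str, str]] = []
    for em in emissions[1:]:
        new_score, pointers = {}, {}
        for cur in labels:
            best, best_prev = max((score[p] + transition(p, cur), p) for p in labels)
            new_score[cur] = best + em[cur]
            pointers[cur] = best_prev
        score = new_score
        back.append(pointers)
    # Follow back-pointers from the best final label
    path = [max(labels, key=lambda l: score[l])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def fields(tags: List[str], data: bytes) -> List[Tuple[str, bytes]]:
    """Group bytes into (field type, field bytes); expects a valid BIO sequence."""
    out: List[Tuple[str, bytes]] = []
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            out.append((tag[2:], data[i:i + 1]))
        else:
            kind, chunk = out[-1]
            out[-1] = (kind, chunk + data[i:i + 1])
    return out


# Made-up per-byte scores for a made-up 4-byte message
DATA = bytes([0x00, 0x01, 0x00, 0x06])
EMISSIONS = [
    {"B-TID": 2.0, "I-TID": 0.0, "B-LEN": 0.5, "I-LEN": 0.0},
    {"B-TID": 0.0, "I-TID": 1.0, "B-LEN": 0.2, "I-LEN": 1.2},
    {"B-TID": 0.1, "I-TID": 0.3, "B-LEN": 2.0, "I-LEN": 0.4},
    {"B-TID": 0.0, "I-TID": 0.2, "B-LEN": 0.3, "I-LEN": 1.5},
]

greedy = [max(LABELS, key=lambda l: em[l]) for em in EMISSIONS]  # per-byte argmax
tags = viterbi(EMISSIONS, LABELS)                                # constrained decode
print(greedy)
print(tags)
print(fields(tags, DATA))
```

In this toy input, the per-byte argmax picks `I-LEN` for the second byte even though no `LEN` field has started, while the constrained decode keeps that byte inside the `TID` field. This is the kind of consistency the abstract attributes to label transitions.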