网络安全与数据治理 (Cyber Security and Data Governance), 2024, Vol. 43, Issue 12: 54-59, 6. DOI: 10.19358/j.issn.2097-1788.2024.12.008
Text data entity recognition based on multi-head convolution residual connections
Liu Wei (刘微) 1, Li Bo (李波) 1, Yang Siyao (杨思瑶) 1
Author information
- 1. School of Information Science and Engineering, Shenyang Ligong University, Shenyang, Liaoning 110158
Abstract
To build a relational database from the text data in work reports, useful information entities must be extracted from unstructured text, yet traditional networks lose features during this extraction. To address both problems, a deep learning-based entity recognition model named RoBERTa-MCR-BiGRU-CRF is proposed. The model first uses the pre-trained Robustly Optimized BERT Pretraining Approach (RoBERTa) model as an encoder and feeds the trained word embeddings into a Multi-head Convolutional Residual network (MCR) layer to enrich semantic information. Next, the embeddings are passed to a Bidirectional Gated Recurrent Unit (BiGRU) layer to further capture contextual features. Finally, a Conditional Random Field (CRF) layer is used for decoding and label prediction. Experimental results show that the model achieves an F1 score of 96.64% on the work report dataset, outperforming the other models compared. Additionally, for named entity categories in the data, the F1 score is 3.18% and 2.87% higher than BERT-BiLSTM-CRF and RoBERTa-BiGRU-CRF, respectively. The results demonstrate the model's effectiveness in extracting useful information from unstructured text.
Keywords
deep learning/named entity recognition/neural networks/data mining
Classification
Information technology and security science
Cite this article
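Liu Wei, Li Bo, Yang Siyao. Text data entity recognition based on multi-head convolution residual connections[J]. 网络安全与数据治理 (Cyber Security and Data Governance), 2024, 43(12): 54-59, 6.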
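
The last stage described in the abstract, CRF decoding and label prediction, is commonly carried out with the Viterbi algorithm. The following is a minimal sketch under that assumption, not the authors' implementation. The tag set, the emission scores (which in the proposed model would presumably come from the BiGRU layer), and the transition scores are all invented for illustration.

```python
from typing import Dict, List


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence for one sentence.

    emissions[t][tag]      : score of `tag` at token t (hypothetical encoder output)
    transitions[prev][cur] : score of moving from tag `prev` to tag `cur`
    """
    if not emissions:
        return []
    # best[t][tag] holds the best score of any path ending in `tag` at token t.
    best = [{tag: emissions[0][tag] for tag in tags}]
    backptr: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            # Pick the previous tag that maximizes path score into `cur`.
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        backptr.append(pointers)
    # Trace back from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(backptr):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Illustrative data only: a BIO tag set for a single hypothetical entity type.
TAGS = ["O", "B-ENT", "I-ENT"]
# All transitions score 0 except O -> I-ENT, which is invalid in BIO tagging.
TRANSITIONS = {p: {c: 0.0 for c in TAGS} for p in TAGS}
TRANSITIONS["O"]["I-ENT"] = -10.0
EMISSIONS = [
    {"O": 1.0, "B-ENT": 0.8, "I-ENT": 0.0},
    {"O": 0.2, "B-ENT": 0.1, "I-ENT": 2.0},
    {"O": 1.5, "B-ENT": 0.0, "I-ENT": 0.3},
]

print(viterbi_decode(EMISSIONS, TRANSITIONS, TAGS))  # ['B-ENT', 'I-ENT', 'O']
```

Choosing the best tag per token independently would give `O, I-ENT, O` here, which is not a valid BIO sequence. The transition penalty lets the decoder prefer `B-ENT, I-ENT, O` instead, which illustrates why a CRF layer is often placed on top of a sequence encoder for label prediction.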