Computer Engineering, 2024, Vol. 50, Issue (7): 324-332, 9. DOI: 10.19678/j.issn.1000-3428.0067645
Chinese Medical Entity Recognition Based on Attention Enhancement and Feature Fusion
Abstract
Chinese medical named entity recognition models based on character representation suffer from a single embedding form, difficult boundary recognition, and insufficient use of semantic information. An effective remedy is to inject lexical features at the bottom layers of BERT, which reduces the impact of word segmentation errors while exploiting word-granularity semantic information. However, injecting vocabulary information also introduces low-correlation words and noise, which distracts the attention of the attention-based BERT model. In addition, word granularity alone is not enough to fully mine the deep semantic information of Chinese characters. This study therefore proposes a Chinese medical entity recognition model based on attention enhancement and feature fusion. Sparsifying the attention score matrix of words makes the model focus on highly correlated words, which effectively reduces the interference of noisy words in the context. At the same time, Convolutional Neural Networks (CNNs) extract features from the pronunciation and strokes of Chinese characters; these features are fused with the output features of the BERT model through an iterative attention feature fusion module and subsequently concatenated and fed into a BiLSTM model to further mine the deep semantic information contained in the characters. For the experiments, a large medical corpus is collected using a crawler and other methods and used to train a medical-domain word vector library, and the model is evaluated on the CCKS2017 and CCKS2019 datasets. The experimental results show that the F1 values of the model reach 94.90% and 89.37%, respectively, which are higher than those of current mainstream entity recognition models. The proposed model therefore exhibits higher recognition performance.
Keywords
entity recognition/Chinese word segmentation/sparse attention/feature fusion/medical word vector library
Classification
Information Technology and Security Science
Cite this article
王晋涛, 秦昂, 张元, 陈一飞, 王廷凤, 谢承霖, 邹刚. Chinese Medical Entity Recognition Based on Attention Enhancement and Feature Fusion[J]. Computer Engineering, 2024, 50(7): 324-332, 9.
Funding
Natural Science Foundation of Hunan Province (2022JJ70022).
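The abstract says that the attention score matrix of words is sparsified so that the model attends only to highly correlated words, but it does not say how. The sketch below illustrates one common way to do this, assuming a top-k rule: in each row of the score matrix, the k largest scores are kept and the rest are masked before the softmax. The function name `topk_sparse_attention`, the parameter `k`, and the top-k rule itself are assumptions made for illustration and may differ from the paper's actual method.

```python
import numpy as np


def topk_sparse_attention(scores, k):
    """Softmax over each row after masking all but the k largest scores.

    Hypothetical illustration: the top-k rule is an assumption, not
    necessarily the sparsification used in the paper.
    """
    scores = np.asarray(scores, dtype=float)
    if k < 1:
        raise ValueError("k must be at least 1")
    k = min(k, scores.shape[-1])

    # The k-th largest score in each row serves as the keep threshold.
    kth = np.partition(scores, -k, axis=-1)[..., -k][..., None]

    # Scores below the threshold are set to -inf, so they get zero weight.
    # Ties at the threshold are all kept, so a row may keep more than k.
    masked = np.where(scores >= kth, scores, -np.inf)

    # Numerically stable softmax over the surviving scores.
    masked = masked - masked.max(axis=-1, keepdims=True)
    weights = np.exp(masked)
    return weights / weights.sum(axis=-1, keepdims=True)


# Toy example: 2 characters attending over 4 candidate words.
scores = np.array([[2.0, 0.1, 1.0, -1.0],
                   [0.0, 3.0, 0.5, 2.5]])
weights = topk_sparse_attention(scores, k=2)
print(weights)  # each row sums to 1; masked words have weight 0
```

With `k=2`, only the two highest-scoring words in each row receive non-zero attention, so weakly related words injected from the lexicon cannot dilute the weights of the relevant ones.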