Chinese Medical Entity Recognition Based on Attention Enhancement and Feature Fusion

Wang Jintao, Qin Ang, Zhang Yuan, Chen Yifei, Wang Tingfeng, Xie Chenglin, Zou Gang

Computer Engineering, 2024, Vol. 50, Issue 7: 324-332 (9 pages). DOI: 10.19678/j.issn.1000-3428.0067645

Chinese Medical Entity Recognition Based on Attention Enhancement and Feature Fusion

Wang Jintao [1], Qin Ang [2], Zhang Yuan [1], Chen Yifei [2], Wang Tingfeng [1], Xie Chenglin [3], Zou Gang [4]

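Author information

  • 1. School of Computer Science and Technology, North University of China, Taiyuan, Shanxi 030051
  • 2. Hunan Cancer Hospital, Changsha, Hunan 410031
  • 3. Affiliated Hospital of Hunan Academy of Chinese Medicine (湖南省中医药研究院附属医院), Changsha, Hunan 410006
  • 4. School of Computer Science and Technology, North University of China, Taiyuan, Shanxi 030051; Hunan Zhongke Zhuying Intelligent Technology Research Institute (湖南中科助英智能科技研究院), Changsha, Hunan 410076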

Abstract

To address problems such as a single embedding form, difficult boundary recognition, and insufficient use of semantic information in character-based Chinese medical named entity recognition models, an effective approach is to inject lexical features into the bottom layer of BERT. This reduces the impact of word segmentation errors while exploiting word-level semantic information. However, injecting vocabulary information also introduces weakly correlated words and noise, which distracts the attention of the attention-based BERT model. In addition, word-level information alone is not enough to fully mine the deep semantic information of Chinese characters. This study therefore proposes a Chinese medical entity recognition model based on attention enhancement and feature fusion. Sparsifying the attention score matrix over words makes the model focus on highly correlated words, which effectively reduces interference from noisy words in the context. At the same time, Convolutional Neural Networks (CNNs) extract pronunciation and stroke features of Chinese characters; these are fused with the output features of the BERT model through an iterative attention feature fusion module and then passed to a BiLSTM model to further mine the deep semantic information contained in the characters. For the experiments, a large medical corpus is collected with a web crawler and other methods and used to train a medical-domain word vector library, and the model is evaluated on the CCKS2017 and CCKS2019 datasets. The F1 values of the model reach 94.90% and 89.37%, respectively, higher than those of current mainstream entity recognition models, indicating that the proposed model achieves better recognition performance.
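The attention sparsification described in the abstract can be illustrated with a small sketch. The abstract does not say how the score matrix is sparsified, so the top-k rule used here, the function name `topk_sparse_attention`, and the parameter `k` are all assumptions chosen for illustration, not the authors' method. The idea shown is only that low-scoring (weakly correlated) words receive zero attention weight after the softmax.

```python
import numpy as np

def topk_sparse_attention(scores, k):
    """Keep the k largest scores in each row, mask the rest, then apply softmax.

    Hypothetical illustration: top-k masking is an assumed sparsification rule,
    not one taken from the paper. Ties at the threshold may keep more than k entries.
    """
    scores = np.asarray(scores, dtype=float)
    k = max(1, min(k, scores.shape[-1]))
    # The k-th largest value in each row serves as that row's threshold.
    kth = np.partition(scores, -k, axis=-1)[..., -k][..., None]
    # Scores below the threshold are set to -inf so they get zero weight.
    masked = np.where(scores >= kth, scores, -np.inf)
    # Subtract the row maximum for numerical stability before exponentiating.
    masked = masked - masked.max(axis=-1, keepdims=True)
    weights = np.exp(masked)
    return weights / weights.sum(axis=-1, keepdims=True)

# One character attending over four candidate words; only the two
# highest-scoring words keep a non-zero attention weight.
weights = topk_sparse_attention([[1.0, 3.0, 2.0, 0.5]], k=2)
print(weights)
```

In this sketch each row of `scores` stands for one character and each column for one candidate word; the remaining weights in each row still sum to 1.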

Key words

entity recognition / Chinese word segmentation / sparse attention / feature fusion / medical word vector library

Classification

Information Technology and Security Science

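Cite this article

Wang Jintao, Qin Ang, Zhang Yuan, Chen Yifei, Wang Tingfeng, Xie Chenglin, Zou Gang. Chinese Medical Entity Recognition Based on Attention Enhancement and Feature Fusion[J]. Computer Engineering, 2024, 50(7): 324-332.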

Funding

Natural Science Foundation of Hunan Province (2022JJ70022).

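Journal

Computer Engineering (计算机工程). Open access; indexed in the Peking University Core Journals list (北大核心) and CSTPCD. ISSN 1000-3428.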
