Chinese Medical Named Entity Recognition Based on SoftLexicon and Adversarial Training

Pan Shipeng, Turdi Tohti, Liang Yi, Askar Hamdulla

Journal of Shanxi University (Natural Science Edition), 2024, Vol. 47, Issue 2: 260-268, 9. DOI: 10.13451/j.sxu.ns.2023165

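Author information

  • 1. School of Computer Science and Technology, Xinjiang University, Urumqi 830017, Xinjiang, China; Xinjiang Key Laboratory of Multilingual Information Technology, Urumqi 830017, Xinjiang, China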

Abstract

Most existing medical entity recognition models cannot fully extract and utilize the lexical information in a text sequence, and their structures are complex. As a result, they suffer from inaccurate entity boundary recognition and poor robustness when processing medical texts. In addition, most word-granularity-based named entity recognition (NER) methods do not fully solve the problem of information omission. To address these problems, this paper proposes a named entity recognition model based on word and character fusion and adversarial training. The model uses the pre-trained model BERT to obtain word vectors of the text sequence. SoftLexicon is then used to introduce lexical information, and perturbation samples generated by adversarial training are added to the word vectors. Finally, BiLSTM-CRF is used to extract features and obtain the sequence annotation results. Experiments on the CCKS2019 and CCKS2020 datasets give F1 values of 85.07% and 90.39%, respectively. Compared with the baseline model, the F1 value increases by 2.31% and 2.88%, respectively, indicating that combining the word and character fusion method with adversarial training can effectively identify medical entities.
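To make the lexical-information step concrete, here is a minimal sketch of how SoftLexicon-style word sets might be built for each character. It is not taken from the paper's code. The function name `soft_lexicon_sets`, the `max_len` parameter, the toy lexicon, and the example phrase are all assumptions made for illustration. For every lexicon word matched in the sentence, each character is assigned the word under B (begins the word), M (inside the word), E (ends the word), or S (is a single-character word).

```python
from typing import Dict, List, Set


def soft_lexicon_sets(
    sentence: str, lexicon: Set[str], max_len: int = 5
) -> List[Dict[str, Set[str]]]:
    """Collect the B/M/E/S matched-word sets for every character.

    Hypothetical helper for illustration; max_len caps the length of
    candidate substrings checked against the lexicon.
    """
    sets = [{"B": set(), "M": set(), "E": set(), "S": set()} for _ in sentence]
    n = len(sentence)
    for start in range(n):
        for end in range(start + 1, min(n, start + max_len) + 1):
            word = sentence[start:end]
            if word not in lexicon:
                continue
            if len(word) == 1:
                # The character alone is a lexicon word.
                sets[start]["S"].add(word)
                continue
            sets[start]["B"].add(word)        # first character of the word
            sets[end - 1]["E"].add(word)      # last character of the word
            for mid in range(start + 1, end - 1):
                sets[mid]["M"].add(word)      # interior characters
    return sets


# Toy example (assumed data): a short medical phrase and a tiny lexicon.
sentence = "胃癌根治术"
lexicon = {"胃癌", "根治", "根治术", "胃癌根治术", "术"}
sets = soft_lexicon_sets(sentence, lexicon)
for ch, word_sets in zip(sentence, sets):
    print(ch, word_sets)
```

In a full model, the words in each set would be mapped to embeddings, pooled, and concatenated with the character representation before the sequence encoder. This sketch stops at the set construction and leaves unmatched sets empty.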

Key words

named entity recognition / word and character fusion / adversarial training / Projected Gradient Descent (PGD)

Classification

Computer science and automation

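Cite this article

Pan Shipeng, Turdi Tohti, Liang Yi, Askar Hamdulla. Chinese Medical Named Entity Recognition Based on SoftLexicon and Adversarial Training [J]. Journal of Shanxi University (Natural Science Edition), 2024, 47(2): 260-268, 9.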

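Funding

National Natural Science Foundation of China (62166042, U2003207)

Natural Science Foundation of Xinjiang Uygur Autonomous Region (2021D01C076)

Strengthening Plan of the National Defense Science and Technology Foundation of China (2021-JCJQ-JJ-0059)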

Journal of Shanxi University (Natural Science Edition)

OA, CSTPCD

ISSN 0253-2395
