Computer Engineering, 2024, Vol. 50, Issue (2): 337-344, 8. DOI: 10.19678/j.issn.1000-3428.0067285
Chinese Medical Named Entity Recognition Based on Multi-Granularity Glyph Enhancement
Abstract
Chinese Medical Named Entity Recognition (CMNER) focuses on extracting entities from unstructured Chinese medical texts. Current character-based CMNER models do not adequately capture the distinctive features of Chinese characters from multiple perspectives, which limits their effectiveness in CMNER applications. To address this, a Chinese medical named entity recognition model enhanced with multi-granularity glyph information is introduced. The model integrates the glyph spatial structure and radical representation of Chinese characters and aligns them with word information drawn from a domain-specific lexicon, enriching the semantic and boundary information of each character. Through a gating mechanism, the model combines domain-specific terms with the multifaceted glyph features of Chinese characters, so that both domain relevance and intrinsic character details are taken into account, which strengthens its capacity for medical entity recognition. The multi-granularity glyph-enhanced character representations are fed into a Bidirectional Long Short-Term Memory (BiLSTM) layer for contextual encoding and a Conditional Random Field (CRF) layer for label decoding. Experimental results show that the proposed model surpasses the best baseline model, improving F1 scores by 1.04% and 0.62% on the IMCS21 and CMeEE datasets, respectively. Ablation studies further confirm the contribution of each component and the model's advantage in recognizing Chinese medical named entities.
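The gating step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the gate's exact formulation, so the element-wise form below, the function name `gated_fusion`, its parameters, and the toy numbers are all assumptions chosen for illustration, not the authors' implementation. It only shows the general idea of a learned gate deciding, per dimension, how much of a glyph feature versus a lexicon feature to keep.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(
    glyph: Sequence[float],
    lexicon: Sequence[float],
    w_glyph: Sequence[float],
    w_lexicon: Sequence[float],
    bias: Sequence[float],
) -> List[float]:
    """Blend two same-sized feature vectors with an element-wise gate.

    Hypothetical formulation (not taken from the paper):
      gate_i  = sigmoid(w_glyph_i * glyph_i + w_lexicon_i * lexicon_i + bias_i)
      fused_i = gate_i * glyph_i + (1 - gate_i) * lexicon_i
    """
    sizes = {len(glyph), len(lexicon), len(w_glyph), len(w_lexicon), len(bias)}
    if len(sizes) != 1:
        raise ValueError("all inputs must have the same length")
    fused = []
    for g, l, wg, wl, b in zip(glyph, lexicon, w_glyph, w_lexicon, bias):
        gate = sigmoid(wg * g + wl * l + b)
        fused.append(gate * g + (1.0 - gate) * l)
    return fused


# Toy 3-dimensional features for a single character (made-up numbers).
glyph_features = [0.2, -1.0, 0.5]
lexicon_features = [1.0, 0.0, -0.5]
zeros = [0.0, 0.0, 0.0]

# With all-zero gate parameters the gate is 0.5 everywhere, so the result
# is the plain average of the two feature vectors.
fused = gated_fusion(glyph_features, lexicon_features, zeros, zeros, zeros)
```

In a trained model the gate parameters would be learned jointly with the rest of the network, and the fused vector would then be passed to the BiLSTM encoder; a full matrix-valued gate over the concatenated features is a common alternative to the element-wise form assumed here.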
Keywords
named entity recognition/medical domain/glyph structure/gating mechanism/domain lexicon
Classification
Information Technology and Security Science
Cite this article
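Liu Wei, Ma Lei, Li Kai, Li Rong. Chinese Medical Named Entity Recognition Based on Multi-Granularity Glyph Enhancement[J]. Computer Engineering, 2024, 50(2): 337-344, 8.
Funding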
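National Natural Science Foundation of China (62266025)
Yunnan Provincial Major Science and Technology Special Program (202202AD080004, 202202AE090008)
Yunnan Fundamental Research Program (Kunming Medical University Joint Special Project) (202201AY070001-258).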