Journal of Jiangsu University (Natural Science Edition), 2024, Vol. 45, Issue (1): 77-84, 8. DOI: 10.3969/j.issn.1671-7775.2024.01.011
Chinese named entity recognition based on multi-head attention character-word integration
Abstract
To address the problems of existing character-word integration methods for Chinese named entity recognition (NER), namely interference from redundant words, complex model architectures, and difficulty in combining with other sequence models, a novel Chinese NER algorithm based on multi-head attention was proposed. The attention mechanism was used to fuse word boundary information efficiently, and interference from redundant words was reduced by fusing the BIE word sets at different positions. A multi-head attention character-word joint model was built from character-word integration modules, multi-head attention modules, and fusion modules. Compared with existing Chinese NER schemes, the proposed algorithm avoids the design of complex sequence models and is therefore easy to combine with existing character-based Chinese NER models. Recall, precision, and F1 value were used as evaluation metrics, and the contribution of each part of the model was verified by ablation experiments. The results show that the proposed algorithm increases the F1 value by 0.28 and 0.69 on the MSRA and Weibo data sets, respectively, and improves precision by 0.07 on the Resume data set.
Key words
Chinese named entity recognition/redundant word interference/word boundary information/character-word integration/multi-head attention/BIE word sets
Classification
Information Technology and Security Science
Cite this article
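王进, 王猛旗, 张昕跃, 孙开伟, 朴昌浩. Chinese named entity recognition based on multi-head attention character-word integration [J]. Journal of Jiangsu University (Natural Science Edition), 2024, 45(1): 77-84, 8.
Funding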
Supported by the National Natural Science Foundation of China (61806033)
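
The abstract mentions BIE word sets without defining them. One plausible reading, assumed here and not confirmed by the source, is that each character gets three sets of lexicon words: those in which it is the beginning (B), an inside character (I), or the end (E). The sketch below builds such sets by simple lexicon matching. The function name `bie_word_sets`, the `max_len` parameter, the toy lexicon, and the choice to skip single-character words are all hypothetical, and the attention-based fusion of these sets is not shown.

```python
def bie_word_sets(sentence, lexicon, max_len=4):
    """Collect, for every character position, the lexicon words in which
    that character is the Beginning (B), Inside (I), or End (E).

    Assumption: only words of length >= 2 are matched, and at most
    `max_len` characters long. Neither detail comes from the abstract.
    """
    sets = [{"B": set(), "I": set(), "E": set()} for _ in sentence]
    n = len(sentence)
    for start in range(n):
        # Candidate words span sentence[start:end], length 2..max_len.
        for end in range(start + 2, min(n, start + max_len) + 1):
            word = sentence[start:end]
            if word not in lexicon:
                continue
            sets[start]["B"].add(word)       # first character of the word
            sets[end - 1]["E"].add(word)     # last character of the word
            for mid in range(start + 1, end - 1):
                sets[mid]["I"].add(word)     # characters strictly inside
    return sets


# Toy example: an ambiguous sentence and a small hypothetical lexicon.
sentence = "南京市长江大桥"
lexicon = {"南京", "南京市", "市长", "长江", "大桥", "长江大桥"}

for char, word_sets in zip(sentence, bie_word_sets(sentence, lexicon)):
    print(char, {tag: sorted(words) for tag, words in word_sets.items()})
```

In this toy sentence the character 长 ends the word 市长 and begins both 长江 and 长江大桥. This is the kind of conflicting boundary evidence that the abstract says the attention mechanism is meant to weigh.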