计算机与现代化 (Computer and Modernization), Issue (3): 24-28, 5. DOI: 10.3969/j.issn.1006-2475.2024.03.004
融合词典信息和句子语义的中文命名实体识别
Chinese Named Entity Recognition with Fusion of Lexicon Information and Sentence Semantics
Abstract
The performance of named entity recognition has improved significantly with the rapid advancement of deep learning techniques. However, the strong results achieved by deep learning networks often rely on large amounts of labeled samples, which makes it difficult to fully exploit deep information in small datasets. In this paper, we propose a Chinese named entity recognition model (LS-NER) that combines lexicon information and sentence semantics. First, potential words matched by characters in the dictionary serve as a priori lexical information for the model, addressing the Chinese word segmentation issue. Then, sentence embeddings containing semantic information, typically used for calculating text similarity, are applied to the named entity recognition task, enabling the model to identify similar entities in analogous sentences. Finally, a feature fusion strategy is devised so that the model can effectively learn the semantic information provided by the sentence embeddings. The experimental results demonstrate that our approach performs well on the two small datasets, Resume and Weibo. Incorporating sentence semantics helps the model learn deeper features without requiring additional external information, resulting in F1 scores on the two datasets that are 0.15 and 2.26 percentage points higher, respectively, than those of the model without sentence information.
Keywords
named entity recognition/BERT/SoftLexicon/Sentence-Bert/conditional random field (CRF)
Classification
Information Technology and Security Science
Cite this article
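王谭, 陈金广, 马丽丽. Chinese Named Entity Recognition with Fusion of Lexicon Information and Sentence Semantics [J]. 计算机与现代化 (Computer and Modernization), 2024, (3): 24-28, 5.
Funding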
Natural Science Basic Research Program of Shaanxi Province (2023-JC-YB-568)
Scientific Research Program of the Shaanxi Provincial Department of Education (22JP028)
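
The lexicon-matching step mentioned in the abstract ("potential words matched by characters in the dictionary") can be illustrated with a small sketch in the style of SoftLexicon, which is listed among the keywords. SoftLexicon groups the lexicon words that cover each character into four sets according to the character's position in the word: B (begin), M (middle), E (end), and S (single-character word). The function name `match_lexicon`, the `max_len` parameter, and the toy lexicon below are assumptions made for illustration only; this is not the authors' implementation, and it omits the frequency weighting and embedding lookup a full model would need.

```python
def match_lexicon(sentence, lexicon, max_len=4):
    """For each character, collect the lexicon words that cover it,
    grouped by the character's position in the word (B/M/E/S)."""
    sets = [{"B": set(), "M": set(), "E": set(), "S": set()} for _ in sentence]
    n = len(sentence)
    for i in range(n):
        # Try every substring starting at i, up to max_len characters long.
        for j in range(i + 1, min(n, i + max_len) + 1):
            word = sentence[i:j]
            if word not in lexicon:
                continue
            if len(word) == 1:
                sets[i]["S"].add(word)      # single-character word
                continue
            sets[i]["B"].add(word)          # character begins the word
            sets[j - 1]["E"].add(word)      # character ends the word
            for k in range(i + 1, j - 1):
                sets[k]["M"].add(word)      # character is inside the word
    return sets


# Hypothetical toy lexicon and sentence, chosen only to show overlapping matches.
toy_lexicon = {"南京", "南京市", "市长", "长江", "长江大桥", "大桥", "桥"}
sentence = "南京市长江大桥"
matched = match_lexicon(sentence, toy_lexicon)

for ch, groups in zip(sentence, matched):
    print(ch, {tag: sorted(words) for tag, words in groups.items() if words})
```

Because every matched word is kept rather than committing to a single segmentation, ambiguous spans such as 市长 versus 长江 both remain available to the downstream model as lexical features.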