Textual-visual semantics-enhanced multimodal named entity recognition method

Xu Xi, Wang Hairong, Wang Tong, Ma He

Application Research of Computers, 2024, Vol. 41, Issue 6: 1679-1685, 7. DOI: 10.19734/j.issn.1001-3695.2023.09.0439

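Author information

1. School of Computer Science and Engineering, North Minzu University, Yinchuan 750021 (Xu Xi, Wang Tong, Ma He)
2. School of Computer Science and Engineering, North Minzu University, Yinchuan 750021; The Key Laboratory of Images and Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan 750021 (Wang Hairong)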

Abstract

To address missing textual-visual semantics and semantically unclear multimodal representations in existing multimodal named entity recognition methods, this paper proposes a multimodal named entity recognition method based on textual-visual semantic enhancement. The method uses several pre-trained models to extract text features, character features, regional visual features, image keywords, and visual labels, so as to describe the semantic information of image-text data comprehensively. It adopts the Transformer and a cross-modal attention mechanism to mine the complementary semantic relationships between image and text features and to guide feature fusion, thereby generating semantically complete text representations and semantically enhanced multimodal representations. By integrating the boundary detection, entity type detection, and named entity recognition tasks, it constructs a multi-task label decoder that performs fine-grained semantic decoding of the input features to improve the semantic accuracy of the predicted features. This decoder jointly decodes the text representations and multimodal representations to obtain globally optimal predicted labels. Extensive experiments on the Twitter-2015 and Twitter-2017 benchmark datasets show that the proposed method increases the average F1 score by 1.00% and 1.41%, respectively, indicating that the model has a strong capability for named entity recognition.
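The cross-modal attention step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain scaled dot-product attention with no learned projections, text tokens as queries, visual regions as both keys and values, and a residual sum as the fusion step. The function name `cross_modal_attention` and the toy feature vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_modal_attention(text_feats, visual_feats):
    """Let each text token attend over all visual regions.

    Assumption: queries are the raw text features and keys/values are the
    raw visual features (a real model would apply learned projections).
    Returns the fused token features and the attention weights.
    """
    dim = len(text_feats[0])
    scale = math.sqrt(dim)
    fused, all_weights = [], []
    for query in text_feats:
        # Similarity of this token to every visual region.
        weights = softmax([dot(query, key) / scale for key in visual_feats])
        # Weighted sum of visual features: the visual context for the token.
        context = [
            sum(w * region[i] for w, region in zip(weights, visual_feats))
            for i in range(dim)
        ]
        # Residual fusion: keep the text feature and add the visual context.
        fused.append([q + c for q, c in zip(query, context)])
        all_weights.append(weights)
    return fused, all_weights


# Toy example: 2 text tokens and 3 visual regions, feature size 2.
text_feats = [[1.0, 0.0], [0.0, 1.0]]
visual_feats = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
fused, weights = cross_modal_attention(text_feats, visual_feats)
```

In this toy input, the first token aligns with the first region and the second token with the second, so each token's largest attention weight falls on its matching region. A trained model would learn these alignments through projection matrices rather than rely on raw feature similarity.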

Keywords

multimodal named entity recognition / multimodal representation / multimodal fusion / multi-task learning / named entity recognition

Classification

Information technology and security science

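Cite this article

Xu Xi, Wang Hairong, Wang Tong, Ma He. Textual-visual semantics-enhanced multimodal named entity recognition method[J]. Application Research of Computers, 2024, 41(6): 1679-1685, 7.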

Funding

Natural Science Foundation of Ningxia (2023AAC03316)

Fundamental Research Funds for the Central Universities, North Minzu University (2022PT_S04)

Application Research of Computers

Open access; indexed in the Peking University Core Journals list and CSTPCD

ISSN: 1001-3695
