Journal of Beijing Information Science and Technology University (Natural Science Edition), 2025, Vol. 40, Issue 6: 79-89, 11. DOI: 10.16508/j.cnki.11-5866/n.2025.06.009
Multimodal named entity recognition based on explicit-implicit dual-path fusion
Abstract
Existing text-based named entity recognition methods struggle to make effective use of visual information, and mainstream multimodal named entity recognition (MNER) methods suffer from insufficient mining of cross-modal semantic associations, limited capability for fusing heterogeneous data, and susceptibility to the semantic gap between modalities. To address these issues, a multimodal named entity recognition model based on explicit-implicit dual-path fusion (DPF-MNER) is proposed. The model adopts a dual-path fusion mechanism for deep cross-modal alignment. In the explicit path, a target entity-lexical relationship graph is built to explicitly model the semantic correspondence between text entities and image regions. In the implicit path, a hard sample alignment mechanism based on momentum contrastive learning is designed: a cross-modal memory bank is maintained through momentum updates, guiding the model to pull related image-text pairs closer and push unrelated ones farther apart in the shared semantic space, thereby alleviating modal deviation. Experimental results on the constructed military domain-specific dataset ME-MNER and the public Twitter-2017 dataset show that DPF-MNER achieves F1 scores of 87.05% and 86.35%, respectively, verifying the method's effectiveness in improving entity recognition accuracy and model generalization capability.
Key words
multimodal named entity recognition (MNER) / contrastive learning / cross-modal alignment / explicit-implicit fusion
Classification
Information Technology and Security Science
Cite this article
Chen Qiang, Gu Xiaoyan, Yang Yi. Multimodal named entity recognition based on explicit-implicit dual-path fusion [J]. Journal of Beijing Information Science and Technology University (Natural Science Edition), 2025, 40(6): 79-89, 11.
Funding
Equipment Pre-research Field Fund Project (61403120404)
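Illustrative sketch
The implicit path summarized in the abstract (momentum contrastive learning with a cross-modal memory bank and hard sample alignment) can be illustrated with a small, dependency-free Python sketch. This is not the authors' implementation: the abstract gives no formulas, so the InfoNCE-style loss, the temperature of 0.07, the momentum coefficient, the top-k rule for picking hard negatives, and all function names below are assumptions chosen for illustration, with toy 2-D vectors standing in for encoder outputs.

```python
import math
from collections import deque


def momentum_update(key_params, query_params, m=0.999):
    """EMA update of the key encoder: key <- m * key + (1 - m) * query."""
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


def l2_normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def hardest_negatives(query, bank, k):
    """Assumed hard-sample rule: the k bank entries most similar to the query."""
    q = l2_normalize(query)
    return sorted(bank, key=lambda n: dot(q, l2_normalize(n)), reverse=True)[:k]


def info_nce(query, positive_key, negative_keys, temperature=0.07):
    """InfoNCE-style loss for one query: -log softmax of the positive logit."""
    q = l2_normalize(query)
    logits = [dot(q, l2_normalize(positive_key)) / temperature]
    logits += [dot(q, l2_normalize(n)) / temperature for n in negative_keys]
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    return log_sum - logits[0]


# Toy cross-modal memory bank of image-side keys (fixed size, oldest dropped first).
memory_bank = deque(
    [[0.0, 1.0], [-1.0, 0.0], [0.7, 0.7], [0.0, -1.0]], maxlen=4
)

text_query = [1.0, 0.0]          # hypothetical text-side embedding
matched_image_key = [0.9, 0.1]   # hypothetical embedding of the paired image

negatives = hardest_negatives(text_query, memory_bank, k=2)
loss_aligned = info_nce(text_query, matched_image_key, negatives)
loss_misaligned = info_nce(text_query, [-1.0, 0.0], negatives)

# After the step, the new key is enqueued and the key encoder drifts slowly
# toward the query encoder.
memory_bank.append(matched_image_key)
key_params = momentum_update([1.0, 2.0], [0.0, 0.0], m=0.9)
```

In this toy setting the loss is small when the text query and its paired image key point in similar directions and large when they do not, which is the "pull related pairs closer, push unrelated ones apart" behavior the abstract describes. The slow momentum update keeps the keys stored in the bank approximately consistent with one another across training steps.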