Modern Electronics Technique (现代电子技术), 2024, Vol. 47, Issue 17: 88-93, 6. DOI: 10.16652/j.issn.1004-373x.2024.17.014
基于图像相对位置和负向感知的图文匹配
Image-text matching based on image relative position and negative perception
Abstract
Image-text matching has attracted extensive attention in computer vision and multimodal information processing. The challenges of this cross-modal task are how to extract visual and textual information efficiently and how to handle inconsistencies between image and text. This paper proposes a novel image-text matching method. The method applies an attention mechanism over the relative positions of image objects, so that relative position information is not ignored and visual information is extracted more efficiently. Meanwhile, so that the similarity contribution of unaligned content between image and text is not overlooked, a negative perception module attends to the negative contribution from the relative positions of image objects and from unaligned words in the text, which improves the accuracy of image-text matching. On the publicly available image-text matching dataset Flickr30K, the proposed method achieves the best rSum score, an improvement of 7.3% over the best existing negative perception model. The proposed method therefore achieves state-of-the-art performance.
Keywords
multimodality / image-text matching / visual information extraction / relative position encoding / attention mechanism / negative perception
Classification
Information technology and security science
Cite this article
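Yu Chao, Wang Mingshuo, Zhao Ziqiao, Yu Qing. Image-text matching based on image relative position and negative perception [J]. Modern Electronics Technique, 2024, 47(17): 88-93, 6.
Funding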
National 973 Key Research and Development Program of China (2014CB340506)