Joint feature approach for image-text cross-modal retrieval

高迪辉 盛立杰 许小冬 苗启广

Journal of Xidian University (Natural Science Edition), 2024, Vol. 51, Issue 4: 128-138 (11 pages). DOI: 10.19665/j.issn1001-2400.20240302

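Author information

  • 1. School of Computer Science and Technology, Xidian University, Xi'an 710071, Shaanxi, China; Key Laboratory of Key Technologies for Big Data and Visual Intelligence, Xidian University, Xi'an 710071, Shaanxi, China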

Abstract

With the rapid development of deep learning, cross-modal retrieval performance has improved significantly. However, existing methods either match images and texts only as a whole or rely only on local information for matching, so they make limited use of the available image and text information and leave room for further improvement in retrieval performance. To fully exploit the latent semantic relationships between images and texts, this paper proposes a cross-modal retrieval model based on joint features. For feature extraction, two sub-networks process the local and global features of images and texts, respectively, and a bilinear layer structure based on the attention mechanism is designed to filter out redundant information. For the loss function, a triplet ranking loss and a semantic label classification loss are combined to jointly optimize the features. The proposed model is also general: it can effectively improve the performance of models that rely only on local information. Experiments on the public datasets Flickr30k and MS COCO show that the proposed model effectively improves performance on cross-modal image-text retrieval tasks. On the Flickr30k retrieval task, the proposed model improves R@1 by 5.1% for text retrieval and by 2.8% for image retrieval.
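The abstract names a triplet ranking loss and the R@1 metric without giving their exact form. The following is a minimal sketch for illustration only, not the authors' implementation: it assumes a hinge-based, bidirectional loss summed over all in-batch negatives, with matched image-text pairs on the diagonal of a similarity matrix. The function names, the margin value 0.2, and the toy matrix `sim` are all hypothetical, and the semantic label classification loss is omitted.

```python
from typing import Sequence


def triplet_ranking_loss(sim: Sequence[Sequence[float]], margin: float = 0.2) -> float:
    """Hinge-based bidirectional triplet ranking loss (assumed form).

    sim[i][j] is the similarity between image i and text j; matched
    pairs are assumed to lie on the diagonal (image i <-> text i).
    """
    n = len(sim)
    loss = 0.0
    for i in range(n):
        pos = sim[i][i]
        for j in range(n):
            if j == i:
                continue
            # Image i as the query, text j as a negative caption.
            loss += max(0.0, margin - pos + sim[i][j])
            # Text i as the query, image j as a negative image.
            loss += max(0.0, margin - pos + sim[j][i])
    return loss


def recall_at_k(sim: Sequence[Sequence[float]], k: int = 1) -> float:
    """Fraction of row queries whose matched item (diagonal) ranks in the top k."""
    n = len(sim)
    hits = 0
    for i in range(n):
        ranked = sorted(range(n), key=lambda j: sim[i][j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / n


# Toy 3x3 similarity matrix (hypothetical values): rows are images, columns are texts.
sim = [
    [0.90, 0.10, 0.30],
    [0.20, 0.80, 0.75],
    [0.40, 0.85, 0.60],
]

if __name__ == "__main__":
    print("loss:", triplet_ranking_loss(sim, margin=0.2))
    print("text retrieval R@1:", recall_at_k(sim, k=1))
```

With rows as images, `recall_at_k(sim, 1)` corresponds to text retrieval (an image query retrieving captions); passing the transposed matrix gives the image retrieval direction.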

Key words

cross-modal retrieval; deep learning; self-attention network; image retrieval

Classification

Information technology and security science

Cite this article

高迪辉, 盛立杰, 许小冬, 苗启广. Joint feature approach for image-text cross-modal retrieval [J]. Journal of Xidian University (Natural Science Edition), 2024, 51(4): 128-138.

Funding

National Natural Science Foundation of China (62272364)

Shaanxi Higher Continuing Education Teaching Reform Research Project (21XJZ004)

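Journal of Xidian University (Natural Science Edition)

Open access; indexed in the Peking University Core Journals list (北大核心) and CSTPCD

ISSN 1001-2400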
