Modern Information Technology, 2025, Vol. 9, Issue 6: 83-88 (6 pages). DOI: 10.19850/j.cnki.2096-4706.2025.06.016
Clothing Image Conditional Retrieval Based on Cross-Attention
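Tian Liren (田力仁) 1, Zhang Xiaoyun (张潇云) 1
Author information
- 1. School of Computer Science, Hunan University of Technology, Zhuzhou, Hunan 412007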
Abstract
Clothing image conditional retrieval refers to using text as feedback to guide the adjustment of clothing image content, and using the adjusted content as the retrieval condition. The key to this task is how to effectively combine the semantic spaces of clothing images and text. To this end, this paper proposes a hybrid-feature cross-attention clothing image retrieval method built on a multimodal pre-training model. First, the multimodal pre-training model is fine-tuned to adapt it to the downstream task. Second, a cross-attention mechanism is used to obtain the fused features. Finally, experiments are carried out on the FashionIQ dataset. The results show that the method achieves a higher retrieval recall and more accurate retrieval results.
Keywords
clothing image / conditional retrieval / cross-attention / multimodal fusion
Classification
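Information Technology and Security Science
Cite this article
Tian Liren, Zhang Xiaoyun. Clothing Image Conditional Retrieval Based on Cross-Attention [J]. Modern Information Technology, 2025, 9(6): 83-88.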
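The abstract names cross-attention as the mechanism that fuses image and text features before retrieval, but gives no architectural details. The following is a minimal NumPy sketch of that general idea, not the authors' implementation. Several choices are assumptions made purely for illustration: image tokens act as queries and text tokens as keys and values, a single attention head is used, random matrices stand in for learned projection weights and for the features a fine-tuned multimodal encoder would produce, and the fused vector is obtained by a residual connection followed by mean pooling. The helper names `cross_attention`, `fuse`, and `retrieve` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, context, w_q, w_k, w_v):
    """Single-head cross-attention: each query token attends over the context tokens."""
    q = queries @ w_q                          # (n_q, d)
    k = context @ w_k                          # (n_c, d)
    v = context @ w_v                          # (n_c, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])    # scaled dot-product scores, (n_q, n_c)
    weights = softmax(scores, axis=-1)         # each row sums to 1
    return weights @ v, weights


def fuse(image_tokens, text_tokens, params):
    """Assumed fusion: image tokens query the text, then residual add and mean pool."""
    attended, _ = cross_attention(image_tokens, text_tokens, *params)
    fused = (image_tokens + attended).mean(axis=0)
    return fused / np.linalg.norm(fused)       # unit length for cosine similarity


def retrieve(query_vec, gallery, top_k=3):
    """Rank gallery vectors by cosine similarity to the fused query."""
    gallery = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = gallery @ query_vec
    return np.argsort(-sims)[:top_k], sims


# Placeholder data: random arrays stand in for encoder outputs (assumption).
rng = np.random.default_rng(0)
d = 16
image_tokens = rng.normal(size=(5, d))   # e.g. patch features of the reference image
text_tokens = rng.normal(size=(7, d))    # e.g. token features of the feedback text
params = tuple(rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))

query = fuse(image_tokens, text_tokens, params)
gallery = rng.normal(size=(10, d))       # candidate image embeddings
top_idx, sims = retrieve(query, gallery)
print(top_idx)
```

In a real system the projection matrices would be trained jointly with the fine-tuned encoder, and retrieval quality on a dataset such as FashionIQ would be measured with a recall-at-k metric over the ranked gallery.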