
Text-Guided Clothing Image Retrieval Based on Multi-Granularity Matching (OA, CSTPCD)


Text-guided image retrieval combines a query image and a text condition into a single multimodal query. Existing methods improve performance by building increasingly sophisticated fine-grained metric learning, but when the text condition is imprecise this can cause the model to overfit to the target image and makes the retrieved results monotonous. To address this problem, we propose a text-guided clothing image retrieval method based on feature enhancement and multi-granularity matching. First, noise following a normal distribution is generated according to the distribution of the target features, so that the target features undergo small intra-class jitter. Second, a constraint is imposed on the enhanced features according to the fluctuation of the target features: the larger the fluctuation, the greater the penalty on the enhanced features, which yields a coarse-grained matching loss. Finally, the learning strategy is optimized by using a dynamic weight that decays over training iterations to unify the coarse-grained and fine-grained losses. The method reduces the model's rejection of potential target images and increases the diversity of the features it recognizes. Extensive experiments on two public clothing datasets, FashionIQ and Shoes, show that the method improves recall and returns more varied retrieval results.
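The three steps in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration rather than the authors' implementation: the function names, the per-dimension standard deviation as the measure of "fluctuation", the `(1 + fluctuation)` penalty factor, the cosine decay schedule, and the choice to apply the decaying weight to the coarse-grained term.

```python
import math
import random

def feature_std(targets):
    """Per-dimension (population) standard deviation of a batch of target features."""
    n, dim = len(targets), len(targets[0])
    means = [sum(t[d] for t in targets) / n for d in range(dim)]
    return [math.sqrt(sum((t[d] - means[d]) ** 2 for t in targets) / n)
            for d in range(dim)]

def jitter(target, std, scale=0.1, rng=random):
    """Step 1 (assumed form): add zero-mean Gaussian noise whose spread
    follows the batch statistics, giving a small intra-class jitter."""
    return [x + rng.gauss(0.0, scale * s) for x, s in zip(target, std)]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def coarse_loss(query, augmented, std):
    """Step 2 (assumed form): distance between the query feature and the
    jittered target, penalized more heavily as the target fluctuation grows."""
    fluctuation = sum(std) / len(std)
    return (1.0 + fluctuation) * sq_dist(query, augmented)

def dynamic_weight(step, total_steps, w0=1.0):
    """Step 3 (assumed schedule): cosine decay from w0 at step 0 to 0 at the end."""
    return w0 * 0.5 * (1.0 + math.cos(math.pi * step / total_steps))

def total_loss(fine, coarse, step, total_steps):
    """Unify the two granularities; the coarse term fades as training proceeds."""
    return fine + dynamic_weight(step, total_steps) * coarse
```

Under these assumptions the coarse-grained term dominates early in training, when tolerating near-miss targets is most useful, and the fine-grained term takes over as the weight decays. Any monotonically decreasing schedule could be substituted for the cosine form.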

Xiao Huaxing (肖华兴); Ma Lili (马丽丽); Chen Jinguang (陈金广)

School of Computer Science, Xi'an Polytechnic University, Xi'an, Shaanxi 710048, China

Subject category: Computer Science and Automation

Keywords: text guided; image retrieval; feature enhancement; multi-granularity matching; multi-modal fusion

Computer Technology and Development (《计算机技术与发展》), 2024, No. 7

Pages 24-30 (7 pages)

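Funding: Natural Science Basic Research Program of Shaanxi Province (2023-JC-YB-568); Scientific Research Program of the Shaanxi Provincial Department of Education (22JP028); Shaanxi Computer Society & Xiangteng Company Fund (XT-QC-202309-119287)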

10.20165/j.cnki.ISSN1673-629X.2024.0119
