
Region-Aware Fashion Contrastive Learning for Attribute Recognition and Composed Retrieval

Wang Kangping, Zhao Mingbo

Journal of Donghua University (English Edition), 2024, Vol. 41, Issue 4: 405-415, 11. DOI: 10.19884/j.1672-5220.202405006


Region-Aware Fashion Contrastive Learning for Unified Attribute Recognition and Composed Retrieval

Wang Kangping 1, Zhao Mingbo 1

Author information

  • 1. College of Information Science and Technology, Donghua University, Shanghai 201620, China

Abstract

Clothing attribute recognition has become an essential technology, enabling users to automatically identify the characteristics of clothes and to search for clothing images with similar attributes. However, existing methods cannot recognize newly added attributes and may fail to capture region-level visual features. To address these issues, a region-aware fashion contrastive language-image pre-training (RaF-CLIP) model is proposed. The model aligns cropped and segmented images with category texts and multiple fine-grained attribute texts, matching fashion regions to their corresponding texts through contrastive learning. Clothing retrieval finds suitable clothing based on user-specified categories and attributes. To further improve retrieval accuracy, an attribute-guided composed network (AGCN) is introduced as an additional component on top of RaF-CLIP, designed specifically for composed image retrieval, a task that aims to retrieve the expected target image by modifying a reference image according to a textual expression. Using transformer-based bidirectional attention and a gating mechanism, AGCN fuses and selects image features and attribute text features. Experimental results show that the proposed model achieves a mean precision of 0.6633 on the attribute recognition task and a recall@10 of 39.18 on the composed image retrieval task, where recall@k is defined as the percentage of correct samples appearing in the top k retrieval results. The model thus meets users' needs to search freely for clothing through images and text.
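The recall@k metric defined in the abstract can be illustrated with a minimal sketch. It assumes one common reading of that definition: each query has exactly one correct target, and recall@k is the percentage of queries whose target appears among the top k ranked results. The function name `recall_at_k` and the toy image ids are hypothetical and do not come from the paper.

```python
from typing import Hashable, Sequence


def recall_at_k(
    ranked_results: Sequence[Sequence[Hashable]],
    targets: Sequence[Hashable],
    k: int,
) -> float:
    """Percentage of queries whose target appears in the top-k ranked results.

    Assumption: exactly one correct target per query.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if len(ranked_results) != len(targets):
        raise ValueError("need exactly one target per query")
    if not targets:
        raise ValueError("need at least one query")
    # Count queries whose target is found within the first k results.
    hits = sum(
        1
        for ranking, target in zip(ranked_results, targets)
        if target in ranking[:k]
    )
    return 100.0 * hits / len(targets)


# Toy data (hypothetical image ids, not results from the paper).
rankings = [
    ["img_3", "img_7", "img_1"],
    ["img_2", "img_9", "img_4"],
    ["img_5", "img_6", "img_8"],
    ["img_0", "img_4", "img_2"],
]
targets = ["img_7", "img_4", "img_0", "img_0"]

print(recall_at_k(rankings, targets, k=1))  # 25.0
print(recall_at_k(rankings, targets, k=2))  # 50.0
```

Under this reading, a recall@10 of 39.18 would mean that the correct target image was ranked among the first ten results for about 39% of the queries.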

Key words

attribute recognition / image retrieval / contrastive language-image pre-training (CLIP) / image text matching / transformer

Classification

Information Technology and Security Science

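Cite this article

Wang Kangping, Zhao Mingbo. Region-Aware Fashion Contrastive Learning for Unified Attribute Recognition and Composed Retrieval [J]. Journal of Donghua University (English Edition), 2024, 41(4): 405-415, 11.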

Funding

National Natural Science Foundation of China (No. 61971121)

Journal of Donghua University (English Edition)

ISSN 1672-5220
