CAMN: A Pedestrian Model for Text Retrieval Based on Cross-modal Attribute Matching Alignment

覃晓 卢虹妃 吴琨生

Journal of Guangxi Academy of Sciences, 2025, Vol. 41, Issue (1): 1-11. DOI: 10.13657/j.cnki.gxkxyxb.20250429.001

覃晓 (1), 卢虹妃 (1), 吴琨生 (2)

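Author information

  • 1. Guangxi Key Laboratory of Human-machine Interaction and Intelligent Decision, Nanning Normal University, Nanning, Guangxi 530100
  • 2. School of Physics and Electronic Information, Guangxi Minzu University, Nanning, Guangxi 530006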

Abstract

Existing pedestrian retrieval models have made good progress in global feature alignment between image and text. However, they still fall short in capturing pedestrian details and in deeply mining the internal dependencies between image and text. To address these problems, this paper first designs a new image feature extraction network, the Multi-head self-attention network (MHANet), to obtain more detailed global image features. Second, to address the insufficient correlation between image and text local attribute features, this paper proposes a cross-modal attribute attention (ACA) module, which strengthens the local attribute feature expression of the image under the guidance of text information. Finally, combining the MHANet and ACA modules, this paper proposes a text-based pedestrian retrieval model, the Cross-modal Attribute Matching Alignment Network (CAMN), which improves text-to-image retrieval by accurately aligning global and local attribute features. Experimental results show that, compared with the visual-textual attributes alignment in person search by natural language (ViTAA) network, the Rank-5 of CAMN on the three public datasets CUHK-PEDES, ICFG-PEDES, and RSTPReid increases by 8.33, 9.30, and 9.73 percentage points, respectively. CAMN also shows clear advantages over the other algorithms compared, which indicates that it can align image and text attributes and overcomes the limitations of traditional image feature extraction methods.
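
The Rank-5 figures quoted in the abstract are retrieval accuracies. As a minimal sketch, assuming the usual Rank-k definition for text-to-image person retrieval (a query counts as a hit if any of its k highest-scoring gallery images shows the described person), the metric could be computed as below. The abstract does not give the paper's exact evaluation protocol, so the function name, arguments, and toy data here are hypothetical.

```python
from typing import Sequence


def rank_k_accuracy(
    similarity: Sequence[Sequence[float]],
    query_ids: Sequence[int],
    gallery_ids: Sequence[int],
    k: int = 5,
) -> float:
    """Fraction of text queries whose top-k retrieved images contain the right person."""
    hits = 0
    for scores, query_id in zip(similarity, query_ids):
        # Sort gallery indices by descending similarity to this text query.
        order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        # The query is a hit if any of the k best images shares its identity.
        if any(gallery_ids[j] == query_id for j in order[:k]):
            hits += 1
    return hits / len(query_ids)


# Toy data (hypothetical): 2 text queries scored against 4 gallery images.
similarity = [
    [0.9, 0.2, 0.4, 0.1],  # query describing person 7
    [0.3, 0.8, 0.1, 0.6],  # query describing person 9
]
query_ids = [7, 9]
gallery_ids = [7, 3, 7, 9]

rank1 = rank_k_accuracy(similarity, query_ids, gallery_ids, k=1)
rank2 = rank_k_accuracy(similarity, query_ids, gallery_ids, k=2)
```

A gain reported in percentage points is the plain difference between two such accuracies expressed in percent, not a relative improvement.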

Key words

pedestrian retrieval / cross-modality / attention mechanism / attribute alignment

Classification

Computer and automation

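Cite this article

覃晓, 卢虹妃, 吴琨生. CAMN: A Pedestrian Model for Text Retrieval Based on Cross-modal Attribute Matching Alignment [J]. Journal of Guangxi Academy of Sciences, 2025, 41(1): 1-11.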

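Funding

Supported by the Science and Technology Innovation 2030 "Brain Science and Brain-like Research" Major Project of the Ministry of Science and Technology (2021ZD0201904) and the Guangxi Science and Technology Major Project (桂科AA22068057).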

Journal of Guangxi Academy of Sciences

ISSN 1002-7378
