Deep hashing image retrieval based on an improved Vision Transformer

YANG Mengya (杨梦雅), ZHAO Yan (赵琰), XUE Liang (薛亮)

Journal of Shaanxi University of Science & Technology, 2025, Vol. 43, Issue 4: 183-191 (9 pages).

Deep hashing method based on improved Vision Transformer

YANG Mengya¹, ZHAO Yan¹, XUE Liang¹

Author information

1. College of Electronics and Information Engineering, Shanghai University of Electric Power, Shanghai 201306, China

Abstract

Deep hashing methods based on convolutional neural networks capture global image information poorly, and datasets are imbalanced both between hard and easy samples and between positive and negative sample pairs. To address these problems, this paper proposes an improved deep hashing method based on Vision Transformer, called CMTH. First, CMTH uses a convolutional neural network ahead of the Transformer encoder to extract deep local features, reduce dimensionality, and preserve image resolution. Second, the improved Vision Transformer uses a lightweight multi-head self-attention module to extract high-dimensional deep global features while reducing computational complexity. Finally, a new loss framework is proposed: a normalized focal loss adjusts the weight of hard samples, and a new hash loss reduces the impact of the imbalance between easy and hard samples as well as between positive and negative samples. Compared with the second-best deep hashing algorithm based on Vision Transformer, the mean Average Precision (mAP) on CIFAR-10 and NUS-WIDE improves by an average of 2.35% and 3.75%, respectively, across four different hash code lengths.
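
The reported metric, mean Average Precision under Hamming ranking, can be illustrated with a minimal pure-Python sketch. It is not the paper's evaluation code. The function names are hypothetical, relevance is assumed to mean "same single label" (a multi-label dataset such as NUS-WIDE is usually scored by shared labels instead), and the whole database is ranked rather than a top-k subset.

```python
from typing import List, Sequence


def binarize(features: Sequence[float]) -> List[int]:
    """Map real-valued network outputs to a {0, 1} hash code by sign."""
    return [1 if x >= 0 else 0 for x in features]


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the bit positions where two equal-length codes differ."""
    if len(a) != len(b):
        raise ValueError("hash codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def average_precision(query_code: Sequence[int], query_label: int,
                      db_codes: Sequence[Sequence[int]],
                      db_labels: Sequence[int]) -> float:
    """Average precision of one query when the database is ranked by Hamming distance."""
    # sorted() is stable, so items at equal distance keep their database order.
    order = sorted(range(len(db_codes)),
                   key=lambda i: hamming_distance(query_code, db_codes[i]))
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if db_labels[idx] == query_label:  # assumed relevance rule: same label
            hits += 1
            precision_sum += hits / rank   # precision at this relevant rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(query_codes: Sequence[Sequence[int]],
                           query_labels: Sequence[int],
                           db_codes: Sequence[Sequence[int]],
                           db_labels: Sequence[int]) -> float:
    """Mean of the per-query average precisions."""
    aps = [average_precision(q, l, db_codes, db_labels)
           for q, l in zip(query_codes, query_labels)]
    return sum(aps) / len(aps)
```

In this sketch a query whose relevant items all rank first scores 1.0, and relevant items pushed down the ranking lower the score. This is why better-separated hash codes translate directly into a higher mAP.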

Keywords (from the Chinese abstract)

deep hashing/convolutional neural network/visual attention/image retrieval

Key words

deep hashing/convolutional neural network/Vision Transformer/image retrieval

Classification

Information Technology and Security Science

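Cite this article

YANG Mengya, ZHAO Yan, XUE Liang. Deep hashing image retrieval based on an improved Vision Transformer [J]. Journal of Shaanxi University of Science & Technology, 2025, 43(4): 183-191, 9.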

Funding

National Natural Science Foundation of China (62105196)

Journal of Shaanxi University of Science & Technology

Open access; Peking University Core Journal

ISSN: 2096-398X
