Cross-Fusion Encoder-Based Transformer Image Feature Extraction Network

Gong Yu, Wu Peng

Electronic Science and Technology (电子科技), 2025, Vol. 38, Issue 9: 20-25, 6.

DOI: 10.16180/j.cnki.issn1007-7820.2025.09.003

Cross-Fusion Encoder-Based Transformer Feature Extraction Network

Gong Yu¹, Wu Peng¹

Author information

1. School of Information Science and Engineering, Zhejiang Sci-Tech University, Hangzhou, Zhejiang 310018

Abstract

Window-based vision Transformers tend to destroy fine-grained features and carry a large number of parameters. To address these problems, this study proposes a Transformer image feature extraction network based on a cross-fusion encoder. The feature maps are split into two feature subsets according to the correlation consistency of the image channel features. Two attention modules connected in parallel perform attention calculations separately to obtain local and global information, and a crossover mechanism fuses this information. Building on the inter-window attention module of the CAT Transformer, an in-window attention mode across the channel dimensions of the feature map is designed to avoid destroying texture information and to strengthen the representation of local features. Experimental results show that the proposed model achieves 79.86% Top-1 accuracy with 7.8 MB of parameters on the CIFAR-100 dataset and 80.7% accuracy on the ImageNet-1K dataset. Grad-CAM (Gradient-weighted Class Activation Mapping) is also used to visualize the decision-making process.
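The data flow described in the abstract (channel split, two parallel attention branches, cross fusion) can be pictured with a small pure-Python sketch. This is not the authors' implementation: the index-based split rule, the identity Q/K/V projections, the use of token windows for the local branch, and fusion by concatenation are all assumptions made for illustration, and every function name here is hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V (assumption).

    tokens: list of n vectors, each a list of d floats.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        w = softmax(scores)
        out.append([sum(w[j] * tokens[j][c] for j in range(len(tokens)))
                    for c in range(d)])
    return out


def windowed_attention(tokens, window):
    """Local branch: attention restricted to non-overlapping token windows."""
    out = []
    for start in range(0, len(tokens), window):
        out.extend(self_attention(tokens[start:start + window]))
    return out


def split_channels(tokens, first_idx):
    """Split each token's channels into two subsets by index.

    The paper splits by channel feature correlation consistency; a fixed
    index list stands in for that criterion here (assumption).
    """
    first = set(first_idx)
    a = [[t[c] for c in range(len(t)) if c in first] for t in tokens]
    b = [[t[c] for c in range(len(t)) if c not in first] for t in tokens]
    return a, b


def cross_fuse(local_out, global_out):
    """Hypothetical fusion: concatenate both branch outputs per token."""
    return [l + g for l, g in zip(local_out, global_out)]


def cross_fusion_block(tokens, first_idx, window):
    """Split channels, run local and global branches in parallel, then fuse."""
    a, b = split_channels(tokens, first_idx)
    local_out = windowed_attention(a, window)   # local information
    global_out = self_attention(b)              # global information
    return cross_fuse(local_out, global_out)


tokens = [[1.0, 0.0, 2.0, 0.5],
          [0.0, 1.0, 0.5, 2.0],
          [1.0, 1.0, 0.0, 0.0],
          [0.5, 0.5, 1.0, 1.0]]
fused = cross_fusion_block(tokens, first_idx=[0, 1], window=2)
```

Because each branch only sees half of the channels, the attention cost per branch is lower than attending over the full channel dimension, which is one plausible reading of how such a design keeps the parameter count and computation small.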

Key words

computer vision / image classification / self-attention / feature extraction / contextual information / encoder / channel feature / convolutional neural network

Classification

Information Technology and Security Science

Cite this article

Gong Yu, Wu Peng. Cross-Fusion Encoder-Based Transformer Image Feature Extraction Network[J]. Electronic Science and Technology, 2025, 38(9): 20-25, 6.

Funding

Natural Science Foundation of Zhejiang (LY21F010016)

Electronic Science and Technology

ISSN: 1007-7820
