Journal of Test and Measurement Technology, 2024, Vol. 38, Issue 4: 413-419 (7 pages). DOI: 10.3969/j.issn.1671-7449.2024053
Multi-Label Image Classification Algorithm Based on Swin Transformer and Bi-Level Routing Attention
Multi-Label Image Classification Algorithm Based on Transformer
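Zhang Zhen (张震) 1, Wang He (王贺) 1, Song Hongxu (宋宏旭) 1
Author information
- 1. College of Physics and Electronic Engineering, Shanxi University, Taiyuan, Shanxi 030006, China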
Abstract
Image classification is a fundamental task in image processing. Because an image often carries more than one label, single-label classification no longer meets practical needs, which motivated multi-label image classification. This paper proposes a multi-label image classification framework that uses Swin Transformer for feature extraction and a bi-level routing attention module for feature processing. Swin Transformer extracts multi-scale information through a hierarchical structure and outperforms Vision Transformer on multi-object and finer-grained image recognition. The bi-level routing attention module enables more flexible computation allocation and content awareness. The dynamic attention mechanism adaptively adjusts attention weights according to the characteristics of the input image, so that different positions or features receive different levels of attention, and the intensity and range of attention can be controlled flexibly. The model achieves an average precision of 87.3 on the COCO dataset and 96.7 on the VOC2007 dataset, improving the accuracy of multi-label image classification to a certain extent.
Keywords
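deep learning / multi-label image classification / Swin Transformer / bi-level routing attention
Classification
Information Technology and Security Science
Cite this article
Zhang Zhen, Wang He, Song Hongxu. Multi-label image classification algorithm based on Swin Transformer and bi-level routing attention [J]. Journal of Test and Measurement Technology, 2024, 38(4): 413-419.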
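The abstract reports average precision on COCO and VOC2007 without saying how it is computed. The sketch below shows one common way such a figure is obtained, assuming it is the usual mean average precision (mAP) taken over labels. That is an assumption, not something the abstract states, and benchmark protocols may use interpolated variants. The function names and the toy scores are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def average_precision(scores: Sequence[float], targets: Sequence[int]) -> float:
    """AP for one label: mean of precision@k taken at the rank of each positive image."""
    # Rank images from highest to lowest predicted score for this label.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if targets[idx] == 1:
            hits += 1
            precision_sum += hits / rank
    # Assumption: a label with no positive image contributes 0.0.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(score_matrix: List[List[float]],
                           target_matrix: List[List[int]]) -> float:
    """mAP: the per-label AP averaged over all label columns."""
    num_labels = len(score_matrix[0])
    aps = []
    for c in range(num_labels):
        col_scores = [row[c] for row in score_matrix]
        col_targets = [row[c] for row in target_matrix]
        aps.append(average_precision(col_scores, col_targets))
    return sum(aps) / num_labels


# Toy data (hypothetical): 4 images, 3 labels; each row is one image.
scores = [[0.9, 0.2, 0.8],
          [0.8, 0.7, 0.1],
          [0.3, 0.6, 0.4],
          [0.1, 0.9, 0.7]]
targets = [[1, 0, 1],
           [0, 1, 0],
           [1, 0, 0],
           [0, 1, 1]]

print(round(100 * mean_average_precision(scores, targets), 1))
```

Each label is scored independently, so an image can count as a positive for several labels at once. This is what separates the multi-label setting from single-label classification.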