Survey of Vision Transformer in Fine-Grained Image Classification

Sun Lulu¹, Liu Jianping², Wang Jian³, Xing Jialu¹, Zhang Yue¹, Wang Chenyang¹

Computer Engineering and Applications, 2024, Vol. 60, Issue 10: 30-46, 17. DOI: 10.3778/j.issn.1002-8331.2310-0395

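Author information

1. School of Computer Science and Engineering, North Minzu University, Yinchuan 750021
2. School of Computer Science and Engineering, North Minzu University, Yinchuan 750021; Key Laboratory of Images and Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan 750021
3. Agricultural Information Institute, Chinese Academy of Agricultural Sciences, Beijing 100081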

Abstract

Fine-grained image classification (FGIC) has long been an important problem in computer vision. Compared with traditional image classification, FGIC must distinguish objects from different classes that look extremely similar, which makes the task more difficult. With the development of deep learning, Vision Transformer (ViT) models have become popular in computer vision and have been introduced into FGIC tasks. This paper introduces the challenges faced by FGIC, provides an overview of the ViT model, and analyzes its characteristics. The review is organized primarily by model structure and covers ViT-based FGIC algorithms from four main aspects: feature extraction, feature relation modeling, feature attention, and feature enhancement. Each algorithm is summarized, and its advantages and disadvantages are analyzed. The performance of different ViT models on the same public dataset is then compared to validate their effectiveness on FGIC tasks. Finally, the limitations of current research are pointed out, and future research directions are proposed to further explore the potential of ViT in FGIC.

Key words

fine-grained image classification/Vision Transformer/feature extraction/feature relation modeling/feature attention/feature enhancement

Classification

Information Technology and Security Science

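Cite this article

Sun Lulu, Liu Jianping, Wang Jian, Xing Jialu, Zhang Yue, Wang Chenyang. Survey of Vision Transformer in Fine-Grained Image Classification[J]. Computer Engineering and Applications, 2024, 60(10): 30-46, 17.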

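Funding

Ningxia Key Research and Development Program (Talent Introduction Special Project) (2022BSB03044)

Natural Science Foundation of Ningxia (2021AAC03205)

North Minzu University Scientific Research Start-up Fund Project (2020KYQD37)

North Minzu University Graduate Innovation Project (YCX23168)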

Computer Engineering and Applications

OA / Peking University Core Journals / CSTPCD

ISSN: 1002-8331
