Construction of a Convolution-Enhanced Vision Mamba Model and Its Application

Yu Huanyou, Fan Jing, Huang Fan

Computer Technology and Development (计算机技术与发展), 2025, Vol. 35, Issue (8): 45-52, 8. DOI: 10.20165/j.cnki.ISSN1673-629X.2025.0070

Construction of Convolutional Vision Mamba Model and Its Application

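Yu Huanyou ¹, Fan Jing ², Huang Fan ³

Author information

1. School of Computer and Information Engineering, Shanghai Polytechnic University, Shanghai 201209
2. School of Mathematics, Physics and Statistics, Shanghai Polytechnic University, Shanghai 201209
3. Cloud Core Network Research Department, Huawei Technologies Co., Ltd., Nanjing, Jiangsu 210012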

Abstract

This paper proposes Convolutional Vision Mamba (CvM), an improved model that addresses shortcomings of the Vision Mamba (Vim) model. To process global visual information more efficiently, CvM replaces Vim's image patch partitioning and position encoding mechanisms with convolutional operations, which reduces the high computational cost and memory consumption of Vim's position embedding module. CvM is then applied to medical image classification and evaluated on multiple image datasets: blood cell images, brain tumor images, chest CT scans, pathological myopia images, and pneumonia X-ray images. The experimental results show that, compared with the Vim model and five other neural network models, CvM not only performs well in accuracy but also shows significant improvements in memory usage and parameter count. Finally, in the ablation study, depthwise separable convolution outperforms standard convolution in reducing the number of parameters and memory usage, and significantly improves classification accuracy on the blood cell and brain tumor images. These results demonstrate the feasibility and the clear advantages of the CvM model.
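The parameter saving that the ablation study attributes to depthwise separable convolution can be illustrated with a short calculation. The following is a minimal sketch, not the authors' implementation: the channel counts, kernel sizes, stride, and padding are hypothetical values chosen for illustration, and the function names are introduced here. It compares the weight count of a standard convolution with that of a depthwise convolution followed by a 1x1 pointwise convolution, and shows how a strided convolution determines the size of the resulting token grid.

```python
def conv_output_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output length along one spatial axis of a convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def standard_conv_params(c_in: int, c_out: int, kernel: int, bias: bool = True) -> int:
    """Parameter count of a standard kernel x kernel convolution."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)


def depthwise_separable_params(c_in: int, c_out: int, kernel: int, bias: bool = True) -> int:
    """Parameter count of a depthwise conv followed by a 1x1 pointwise conv."""
    depthwise = kernel * kernel * c_in + (c_in if bias else 0)  # one filter per input channel
    pointwise = c_in * c_out + (c_out if bias else 0)           # 1x1 conv mixes channels
    return depthwise + pointwise


# Hypothetical layer settings (assumptions, not taken from the paper).
C_IN, C_OUT, KERNEL = 64, 192, 3
std = standard_conv_params(C_IN, C_OUT, KERNEL)
dws = depthwise_separable_params(C_IN, C_OUT, KERNEL)
print(f"standard conv parameters:            {std}")
print(f"depthwise separable conv parameters: {dws}")
print(f"reduction factor:                    {std / dws:.1f}x")

# Hypothetical convolutional embedding: 224x224 input, 7x7 kernel, stride 4, padding 2.
side = conv_output_size(224, kernel=7, stride=4, padding=2)
print(f"token grid: {side} x {side} = {side * side} tokens")
```

Ignoring biases, the ratio between the two counts is k²·C_out / (k² + C_out), so the saving grows with both the kernel size and the number of output channels.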

Key words

deep learning / Vision Mamba / convolutional neural network / depthwise separable convolution / medical image classification

Classification

Information Technology and Security Science

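Cite this article

Yu Huanyou, Fan Jing, Huang Fan. Construction of Convolutional Vision Mamba Model and Its Application [J]. Computer Technology and Development, 2025, 35(8): 45-52, 8.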

Funding

National Natural Science Foundation of China (11601316)

Computer Technology and Development

ISSN: 1673-629X
