面向肺部肿瘤分类的跨模态Light-3Dformer模型

周涛, 牛玉霞, 叶鑫宇, 刘隆, 陆惠玲

电子学报 (Acta Electronica Sinica), 2025, Vol. 53, Issue 3: 951-961, 11. DOI: 10.12263/DZXB.20240642

Cross-Modal Light-3Dformer Model for Lung Tumor Classification

周涛¹, 牛玉霞¹, 叶鑫宇¹, 刘隆¹, 陆惠玲²

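Author information

1. School of Computer Science and Engineering, North Minzu University, Yinchuan, Ningxia 750021; Key Laboratory of Image and Graphics Intelligent Processing of the State Ethnic Affairs Commission, North Minzu University, Yinchuan, Ningxia 750021
2. School of Medical Information and Engineering, Ningxia Medical University, Yinchuan, Ningxia 750004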
Abstract

Recognizing lung tumors in 3D multimodal positron emission tomography/computed tomography (PET/CT) images with deep learning is an important research area. In medical images of lung tumors, lesions have irregular spatial shapes and blurred boundaries with the surrounding tissue, which makes it difficult for a model to fully extract tumor features; in addition, models have high computational complexity in three-dimensional tasks. To address these problems, this paper proposes a cross-modal Light-3Dformer model for 3D lung tumor recognition. The main contributions are as follows. First, the backbone network extracts PET/CT image features, while the auxiliary network extracts PET image features and CT image features; multimodal feature enhancement and interactive learning are realized through lightweight cross-modal collaborative attention. Second, a Light-3Dformer module is designed. In this module, the two matrix multiplications of the Transformer are replaced by the linear element-wise multiplication of the Lightformer; a cascaded Lightformer structure is designed whose output feature map is fused with the initial input feature map, so that parallel fusion of deep and shallow features yields a lightweight structure with rich gradient information; and a parameter-free attention mechanism is designed to strengthen lung tumor feature extraction along three dimensions: channel, space, and tomographic image. Third, a lightweight cross-modal collaborative attention module (LCCAM) is designed, which fully learns the complementary cross-modal information in 3D multimodal images and performs interactive learning of deep and shallow features. Finally, ablation and comparative experiments are conducted. On a self-built 3D multimodal lung tumor dataset, the model achieves an accuracy of 90.19% and an area under the curve (AUC) of 89.81%, with optimal computation and running time. Compared with the 3D-SwinTransformer-S model, the computation quantity is reduced by a factor of 117 and the calculation quantity by a factor of 400. The experimental results show that the model extracts multimodal information about lung tumor lesions more effectively, which provides a new approach to lightweight design and multimodal interaction in 3D deep learning models.
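
The abstract does not give the Lightformer formulation, so the following is only a minimal NumPy sketch of the general idea it names: replacing the two matrix multiplications of Transformer attention with element-wise multiplication. The function `elementwise_attention`, its sigmoid gate, and the multiplication counters are hypothetical illustrations, not the authors' method.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def matmul_attention(q, k, v):
    """Standard scaled dot-product attention (two matrix multiplications)."""
    n, d = q.shape
    scores = q @ k.T / np.sqrt(d)         # (n, n): about n*n*d multiplications
    return softmax(scores, axis=-1) @ v   # (n, d): about n*n*d multiplications

def elementwise_attention(q, k, v):
    """Hypothetical linear-cost stand-in (assumption, not the paper's module).

    A sigmoid gate built from the element-wise product q * k scales v,
    so no (n, n) score matrix is ever formed.
    """
    gate = 1.0 / (1.0 + np.exp(-(q * k)))  # (n, d): n*d multiplications
    return gate * v                        # (n, d): n*d multiplications

def matmul_mults(n, d):
    # Multiplications in Q K^T plus those in (scores) V.
    return 2 * n * n * d

def elementwise_mults(n, d):
    # Multiplications in q * k plus those in gate * v.
    return 2 * n * d

rng = np.random.default_rng(0)
n_tokens, dim = 512, 32                    # illustrative sizes only
q, k, v = (rng.standard_normal((n_tokens, dim)) for _ in range(3))
out_matmul = matmul_attention(q, k, v)
out_elementwise = elementwise_attention(q, k, v)
```

Under these assumptions the multiplication count drops from quadratic to linear in the number of tokens, so the ratio `matmul_mults(n, d) / elementwise_mults(n, d)` equals `n`. This matters most for 3D volumes, where the token count grows with depth as well as height and width. The factors of 117 and 400 reported in the abstract come from the authors' experiments and are not reproduced by this sketch.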

Key words

lung tumor; multimodal images; Transformer; Light-3Dformer; lightweight cross-modal collaborative attention

Classification

Computer Science and Automation

Cite this article

周涛, 牛玉霞, 叶鑫宇, 刘隆, 陆惠玲. 面向肺部肿瘤分类的跨模态Light-3Dformer模型[J]. 电子学报, 2025, 53(3): 951-961, 11.

Funding

National Natural Science Foundation of China (No. 62062003)

Natural Science Foundation of Ningxia Province (No. 2023AAC03293)

电子学报 (Acta Electronica Sinica)

Open access; 北大核心 (Peking University core journal list)

ISSN: 0372-2112
