火力与指挥控制 (Fire Control & Command Control), 2026, Vol. 51, Issue 2: 54-60, 69, 8. DOI: 10.3969/j.issn.1002-0640.2026.02.008
一种无人机航拍图像多尺度目标语义分割方法
A Multi-scale Target Semantic Segmentation Method for UAV Aerial Images
Abstract
To address the limited small-target segmentation accuracy and weak cross-scale feature alignment in multi-scale semantic segmentation of UAV aerial images, this paper proposes the TransSeg model. The model integrates the global modeling capability of Swin Transformer with the symmetric encoder-decoder architecture of U-Net, and introduces D-ASPP and residual attention mechanisms. The loss function employs a collaborative optimization strategy combining Dice Loss and Focal Loss. Experiments were conducted on the UAVid dataset, with comparisons against mainstream models including DANet, BiSeNet, and SegFormer, validating the effectiveness of TransSeg in segmenting 8 types of urban scene targets; the model also performs strongly on a custom dataset. The results show that the model achieves a mean IoU of 66.9%, an improvement of 0.9% over SegFormer, with IoU increases of 4.1% and 5.6% for small targets such as moving vehicles and pedestrians, respectively. The research highlights the critical role of global context modeling and dynamic multi-scale fusion mechanisms in enhancing segmentation performance, providing a new solution for UAV remote sensing image processing.
Key words
UAV aerial images / semantic segmentation / multi-scale targets / Swin Transformer / D-ASPP / residual attention mechanism
Classification
Aerospace
Cite this article
张蔚. 一种无人机航拍图像多尺度目标语义分割方法[J]. 火力与指挥控制, 2026, 51(2): 54-60, 69, 8.
Funding
Project funded by the Education Department of Henan Province (ZJC18020, ZJC17098)
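Illustrative note on the loss function: the abstract states that TransSeg is trained with a combination of Dice Loss and Focal Loss, but gives neither the formula nor the weighting. The following is a minimal sketch of one common way to combine the two, simplified to a single binary class over a flat list of pixel probabilities. The function names, the `weight` and `gamma` parameters, and their default values are assumptions made for illustration and are not taken from the paper.

```python
import math

def dice_loss(probs, targets, eps=1e-6):
    """Soft Dice loss for one class: 1 - 2*overlap / (sum(probs) + sum(targets))."""
    overlap = sum(p * t for p, t in zip(probs, targets))
    total = sum(probs) + sum(targets)
    return 1.0 - (2.0 * overlap + eps) / (total + eps)

def focal_loss(probs, targets, gamma=2.0, eps=1e-6):
    """Binary focal loss averaged over pixels: -(1 - p_t)^gamma * log(p_t)."""
    losses = []
    for p, t in zip(probs, targets):
        p_t = p if t == 1 else 1.0 - p   # probability assigned to the true label
        p_t = min(max(p_t, eps), 1.0)    # clamp to avoid log(0)
        losses.append(-((1.0 - p_t) ** gamma) * math.log(p_t))
    return sum(losses) / len(losses)

def combined_loss(probs, targets, weight=0.5, gamma=2.0):
    """Weighted sum of the two losses; `weight` and `gamma` are assumed values."""
    dice = dice_loss(probs, targets)
    focal = focal_loss(probs, targets, gamma)
    return weight * dice + (1.0 - weight) * focal

# Toy example: four pixels, two of which belong to the target class.
targets = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.8, 0.2]
uncertain = [0.6, 0.4, 0.5, 0.5]
print(combined_loss(confident, targets) < combined_loss(uncertain, targets))  # True
```

In this kind of combination, the Dice term measures region overlap and is relatively insensitive to class imbalance, while the focal term down-weights pixels that are already classified well. That pairing is a plausible fit for small targets such as vehicles and pedestrians, though the paper's exact formulation may differ.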