Abstract
Existing models for extracting road structures from remote sensing images are affected by factors such as the imbalance between background and road samples and a limited ability to extract local- or global-scale features, which easily leads to missing roads and to background being misextracted as road. To address these issues, a road extraction model for remote sensing images based on a twin-encoder structure was constructed. In the twin encoder, one branch built a convolutional encoder from dynamic snake-shaped convolution units, while the other branch built a transformer encoder from deformable attention transformers. These deformable operators were used to capture the elongated structural features of roads. In addition, a multi-scale convolutional feature decoder was constructed, in which the features were re-filtered and precisely upsampled by a multi-scale group convolution module and an efficient upsampling module. In the training stage, the unified focal loss function was used to compute the pixel classification loss and the overall similarity loss, suppressing the problems caused by the imbalance between background and road samples. Experimental results show that the F1 score of the model on the large-scale satellite image dataset of representative cities in China (CHN6-CUG) and the DeepGlobe road dataset (DeepGlobe Road) reaches 91.24% and 90.76%, respectively, which is 8.81% and 9.98% higher than that of the third-generation U-shaped network (UNet3+) model. The extracted road structures are more complete, and the model outperforms current mainstream models in extraction accuracy and generalization.
Keywords
road extraction from remote sensing images/twin encoder/dynamic snake-shaped convolution/deformable attention transformer/multi-scale convolution decoding/unified focal loss function
Classification
Astronomy and Earth Sciences