Acta Electronica Sinica (电子学报), 2025, Vol. 53, Issue 10: 3744-3758, 15. DOI: 10.12263/DZXB.20250483
基于跨视图上下文感知的高分辨率遥感图像半监督语义分割方法
Cross-View Context-Aware Semi-Supervised Semantic Segmentation for High-Resolution Remote Sensing Images
Abstract
Semi-supervised semantic segmentation of high-resolution remote sensing images aims to leverage a small number of labeled samples together with a large amount of unlabeled data for joint training, thereby enhancing the performance of semantic segmentation models. This approach not only significantly reduces the cost of manual annotation but also exploits the potential value of unlabeled data. Existing methods typically divide high-resolution remote sensing images into multiple sub-views for training and focus primarily on enforcing prediction consistency under different perturbations of the same view. However, such strategies often overlook the semantic and spatial relationships between different views, limiting the model's ability to learn broader contextual information when labeled data are scarce. To address this issue, this paper proposes a cross-view context-aware semi-supervised semantic segmentation method for high-resolution remote sensing images. The proposed approach explicitly models the contextual interactions among multiple views to improve the quality of pseudo labels and introduces a multi-level cross-view consistency constraint to maintain prediction consistency within a broader contextual scope. Specifically, during training, multiple overlapping views, including a primary view and several contextual views, are sampled from the original high-resolution image and jointly fed into the model. A spatial-aware interaction fusion (SIF) module is designed to perform cross-view feature interaction and fusion via cross-attention and self-attention mechanisms. This module generates spatial attention activation maps that adaptively fuse the predictions from different views, thereby improving pseudo-label accuracy. In addition, a multiple cross-view context consistency (CVCC) mechanism is introduced to enforce consistent predictions in overlapping regions by aligning their spatial correspondences. This constraint enhances the model's ability to perceive and model cross-view contextual information, mitigating semantic ambiguity caused by view variations. To comprehensively evaluate the proposed method, extensive experiments are conducted on the Vaihingen and Potsdam datasets provided by the International Society for Photogrammetry and Remote Sensing, under various labeling ratios. Results show that the proposed method consistently outperforms state-of-the-art semi-supervised segmentation approaches. In particular, under an extremely low-label setting using only one labeled image, it achieves mIoU improvements of 6.84% and 12.73% over the supervised baseline on Vaihingen and Potsdam, respectively, validating its superior performance and strong generalization under limited annotation.
Key words
remote sensing / semantic segmentation / semi-supervised learning / cross-view context consistency / spatial-aware interaction fusion / pseudo-label
Classification
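Information Technology and Security Science
Cite this article
Lü Liang, Lan Jie, Lan Meng, Lu Xiankai, Zhang Lefei. Cross-View Context-Aware Semi-Supervised Semantic Segmentation for High-Resolution Remote Sensing Images [J]. Acta Electronica Sinica, 2025, 53(10): 3744-3758, 15.
Funding
National Natural Science Foundation of China (No. 62431020)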
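
Illustrative note
The abstract describes enforcing consistent predictions in the overlapping regions of views cropped from the same image, by aligning their spatial correspondences. The sketch below is only one possible reading of that idea, not the authors' CVCC implementation: the box layout `(top, left, height, width)`, the function names `overlap_region` and `overlap_consistency`, and the use of a mean squared difference as the penalty are all assumptions made for illustration. The abstract does not specify the actual loss.

```python
from typing import List, Optional, Tuple

# Hypothetical crop box: (top, left, height, width) in full-image pixel coordinates.
Box = Tuple[int, int, int, int]
# Hypothetical prediction layout: pred[row][col] is a list of class probabilities.
Pred = List[List[List[float]]]


def overlap_region(a: Box, b: Box) -> Optional[Box]:
    """Return the intersection of two crop boxes, or None if they do not overlap."""
    top = max(a[0], b[0])
    left = max(a[1], b[1])
    bottom = min(a[0] + a[2], b[0] + b[2])
    right = min(a[1] + a[3], b[1] + b[3])
    if bottom <= top or right <= left:
        return None
    return (top, left, bottom - top, right - left)


def overlap_consistency(pred_a: Pred, box_a: Box, pred_b: Pred, box_b: Box) -> float:
    """Mean squared difference between two views' predictions on shared pixels.

    Assumption: squared error stands in for whatever consistency measure
    the paper actually uses.
    """
    region = overlap_region(box_a, box_b)
    if region is None:
        return 0.0  # no shared pixels, so nothing to constrain
    top, left, height, width = region
    total, count = 0.0, 0
    for y in range(top, top + height):
        for x in range(left, left + width):
            # Map the full-image pixel (y, x) into each view's local coordinates.
            pa = pred_a[y - box_a[0]][x - box_a[1]]
            pb = pred_b[y - box_b[0]][x - box_b[1]]
            total += sum((p - q) ** 2 for p, q in zip(pa, pb))
            count += len(pa)
    return total / count
```

For example, a primary view at `(0, 0, 4, 4)` and a contextual view at `(2, 2, 4, 4)` share a 2x2 block of pixels; only those pixels contribute to the penalty, and the value is zero when both views assign them identical class probabilities.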