Journal of Radars, 2026, Vol. 15, Issue 2: 409-440, 32 pages. DOI: 10.12000/JR25256
SAR Image Target Interpretation Based on Vision-Language Models: A Survey
Abstract
Synthetic Aperture Radar (SAR) is widely used in military and civilian applications, and intelligent target interpretation of SAR images is a crucial component of these applications. Vision-Language Models (VLMs) play an important role in SAR target interpretation. By incorporating natural language understanding, VLMs address the challenges posed by large intraclass variability in target characteristics and the scarcity of high-quality labeled samples, thereby advancing the field from purely visual interpretation toward semantic understanding of targets. Drawing on our team's extensive research experience in SAR target interpretation theory, algorithms, and applications, this paper provides a comprehensive review of intelligent SAR target interpretation based on VLMs. We analyze existing challenges and tasks in depth, summarize the current state of research, and compile the available open-source datasets. Furthermore, we systematically outline the evolution of the field, from task-specific VLMs to contrastive, conversational, and generative VLMs and foundation models. Finally, we discuss the latest challenges and the future outlook for VLM-based SAR target interpretation.
Keywords
Synthetic Aperture Radar (SAR) / Intelligent target interpretation / Vision-Language Model (VLM) / Artificial Intelligence (AI) / Foundation model
Classification
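Astronomy and Earth Sciences
Cite this article
Wang Junyu, Sun Hao, Huang Qihao, Ji Kefeng, Kuang Gangyao. SAR Image Target Interpretation Based on Vision-Language Models: A Survey [J]. Journal of Radars, 2026, 15(2): 409-440, 32.
Funding
The Joint Funds of the National Natural Science Foundation of China (U24B20189)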