Lightweight remote sensing multimodal large language model based on knowledge distillation

Zhang Xinyue (张馨月), Feng Shiyang (冯世阳), Wang Bin (王斌)

Journal of Infrared and Millimeter Waves, 2026, Vol. 45, Issue 1: 103-115 (13 pages). DOI: 10.11972/j.issn.1001-9014.2026.01.012

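Author information

  • All three authors: Key Laboratory for Information Science of Electromagnetic Waves (Ministry of Education), Fudan University, Shanghai 200433; Image and Intelligence Laboratory, School of Information Science and Technology, Fudan University, Shanghai 200433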

Abstract

Remote sensing multimodal large language models (MLLMs), which integrate rich visual and linguistic information, have shown great potential in remote sensing image analysis and interpretation. However, existing knowledge distillation methods focus primarily on compressing unimodal large language models and neglect the alignment of features across modalities, which hinders the performance of large language models on cross-modal tasks. To address this issue, a knowledge distillation method for building lightweight remote sensing MLLMs is proposed. The method aligns multimodal information effectively by aligning the outputs of the different modalities at the feature level. Using the reverse Kullback-Leibler divergence as the loss function, combined with optimization strategies such as teacher mixed sampling and single-step decomposition, further improves the generalization and stability of the student model. Experimental results demonstrate that the proposed method achieves higher accuracy and efficiency on four downstream tasks (remote sensing image scene classification, visual question answering, visual localization, and image description) while significantly reducing the number of model parameters and the demand for computational resources, thereby providing a new solution for the efficient application of MLLMs in remote sensing.
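
For readers unfamiliar with the reverse Kullback-Leibler divergence named in the abstract, the following is a minimal sketch of its textbook definition for two discrete distributions, contrasted with the forward direction. It is not the authors' loss: the abstract gives no formula, and the function names, the example logits, and the `eps` smoothing constant are all assumptions made for illustration. Feature-level alignment, teacher mixed sampling, and single-step decomposition are not covered.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) = sum_i p_i * log(p_i / q_i) for discrete distributions.

    eps is a hypothetical smoothing constant that avoids log(0).
    """
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def forward_kl(teacher_logits, student_logits):
    """Forward KL: expectation taken under the teacher, KL(teacher || student)."""
    return kl_divergence(softmax(teacher_logits), softmax(student_logits))


def reverse_kl(teacher_logits, student_logits):
    """Reverse KL: expectation taken under the student, KL(student || teacher)."""
    return kl_divergence(softmax(student_logits), softmax(teacher_logits))


# Hypothetical logits over a three-token vocabulary.
teacher = [2.0, 1.0, 0.1]
student = [1.5, 1.2, 0.3]

print("forward KL:", forward_kl(teacher, student))
print("reverse KL:", reverse_kl(teacher, student))
```

The two directions generally give different values because the KL divergence is asymmetric. The reverse direction weights each term by the student's own probabilities, so it penalizes the student for placing mass where the teacher assigns little, which is commonly described as mode-seeking behavior.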

Key words

remote sensing images/multimodal large language models/knowledge distillation/reverse Kullback-Leibler divergence/feature alignment

Classification

Information technology and security science

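Cite this article

Zhang Xinyue, Feng Shiyang, Wang Bin. Lightweight remote sensing multimodal large language model based on knowledge distillation [J]. Journal of Infrared and Millimeter Waves, 2026, 45(1): 103-115.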

Funding

Supported by the National Natural Science Foundation of China (62371140) and the National Key Research and Development Program of China (2022YFB3903404).

Journal of Infrared and Millimeter Waves (ISSN 1001-9014)
