Remote sensing object detection based on multi-modal large language models: A case study on image classification

Guo Dongyan (郭东艳), Ma Lili (马丽丽), Jin Xianyong (金贤咏)

Beijing Surveying and Mapping (北京测绘), 2025, Vol. 39, Issue 8: 1091-1096, 6. DOI: 10.19580/j.cnki.1007-3000.2025.08.001

Remote sensing object detection based on multi-modal large language models: A case study on image classification

Guo Dongyan¹, Ma Lili¹, Jin Xianyong²

Author information

1. Unit 61206, Beijing 100042
2. Unit 61618, Beijing 100094

Abstract

Remote sensing image classification is an important component of remote sensing object detection. Vision-based machine learning algorithms have been applied effectively in this field, but problems such as high data acquisition costs and heavy computational resource requirements remain. In recent years, large language models (LLMs) have developed rapidly, and multi-modal large language models (MLLMs) have demonstrated outstanding performance in both natural language processing and computer vision. This paper explores the application of MLLMs to image classification for remote sensing object recognition, verifying their effectiveness experimentally on a specific target dataset (airport target classification) using publicly available large models, without any model training. Experimental results show that several publicly available online MLLMs achieve classification accuracy above 80% and high batch processing speeds without incurring any local computational resource deployment costs, highlighting the considerable potential of online multi-modal large language models in this field.

Keywords

artificial intelligence / remote sensing object recognition / image classification / multi-modal large language model

Classification

Astronomy and Earth Sciences

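Cite this article

Guo Dongyan, Ma Lili, Jin Xianyong. Remote sensing object detection based on multi-modal large language models: A case study on image classification[J]. Beijing Surveying and Mapping, 2025, 39(8): 1091-1096, 6.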

Funding

Beijing Natural Science Foundation (8222011)

Beijing Surveying and Mapping (北京测绘)

ISSN: 1007-3000
