北京测绘 (Beijing Surveying and Mapping), 2025, Vol. 39, Issue 8: 1091-1096, 6. DOI: 10.19580/j.cnki.1007-3000.2025.08.001
Remote sensing object detection based on multi-modal large language models: A case study on image classification
Abstract
Remote sensing image classification is an important component of remote sensing object detection. Vision-based machine learning algorithms have been applied effectively in this field, but problems such as high data acquisition costs and large computational resource requirements remain. In recent years, large language models (LLMs) have developed rapidly, and multi-modal large language models (MLLMs) have demonstrated outstanding performance in both natural language processing and computer vision. This paper explores the application of MLLMs to image classification for remote sensing object recognition. It validates their effectiveness experimentally on a specific target dataset (airport target classification) using publicly available large models, with no model training required. Experimental results show that several publicly available online MLLMs can achieve classification accuracy above 80% and high batch processing speeds without incurring any cost for deploying local computational resources, highlighting the considerable potential of online multi-modal large language models in this field.
Keywords
artificial intelligence / remote sensing object recognition / image classification / multi-modal large language model
Classification
Astronomy and Earth Sciences
Cite this article
郭东艳, 马丽丽, 金贤咏. Remote sensing object detection based on multi-modal large language models: A case study on image classification [J]. 北京测绘 (Beijing Surveying and Mapping), 2025, 39(8): 1091-1096, 6.
Funding
Beijing Natural Science Foundation (8222011)