UAV-RGBT Multispectral Object Detection Dataset and Algorithm Benchmark

Wang Jinzhong (汪进中) 1, Dai Shun (戴顺) 1, Zhang Xiuwei (张秀伟) 1, Tian Xuetao (田雪涛) 2, Xing Yinghui (邢颖慧) 1, Wang Fang (汪芳) 1, Yin Hanlin (尹翰林) 3, Zhang Yanning (张艳宁) 1

Acta Electronica Sinica (电子学报), 2025, Vol. 53, Issue 3: 686-704 (19 pages). DOI: 10.12263/DZXB.20240602

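Author information

  • 1. School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi 710072
  • 2. School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi 710072; Xi'an ASN Technology Group Co., Ltd. (西安爱生技术集团有限公司), Xi'an, Shaanxi 710065
  • 3. School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi 710072; Shenzhen Research Institute of Northwestern Polytechnical University, Shenzhen, Guangdong 518063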

Abstract

Unmanned aerial vehicle (UAV)-based multispectral object detection, which uses both visible (RGB) and thermal infrared (T) images, makes all-weather, all-day target monitoring possible and serves critical roles in military and civilian applications. However, due to the complexity of data acquisition and processing, there is currently a lack of publicly available UAV-based RGB-T multispectral object detection datasets, which to some extent limits research and application in this area. Meanwhile, UAV operational scenarios are characterized by complex and variable conditions, including rapid changes in flight altitude, speed, focal length, and background. As a result, the captured targets exhibit diverse scales, uneven (dense/sparse) distributions, and category imbalance, which presents significant challenges for accurate detection. Furthermore, real-time performance must be guaranteed in applications such as reconnaissance and traffic monitoring. The key to the algorithmic design of a UAV RGB-T object detector is therefore to maintain a trade-off between accuracy and speed. To address these issues, this paper introduces a large-scale UAV-based RGB-T multispectral dataset named UAV-RGBT, which spans seasons and day-night cycles and includes multiple categories and scales. Specifically, UAV-RGBT comprises 20 categories with 5,117 pairs of RGB-T images and over 110,000 annotations, which is conducive to advancing research in UAV-based multispectral object detection algorithms. Moreover, based on the YOLOv8n model, the UAV-based dual-branch multispectral object detection (UAV-DMDet) model is proposed to promote deep fusion of multispectral features through a multi-modal cross-attention fusion module and a multi-modal feature decomposition combination module. This approach achieves a better trade-off among model parameter size, detection speed, and accuracy. Experimental results demonstrate that the UAV-DMDet model improves the mAP@0.5 on the UAV-RGBT dataset by 3.61% and 11.03% in the visible and thermal modalities, respectively, and enhances the mAP@0.5:0.95 by 0.84% and 6.76%, respectively. On the Drone-Vehicle dataset, the UAV-DMDet model outperforms the mainstream algorithm I2MDet, with mAP@0.5 and mAP@0.5:0.95 improvements of 2.66% and 12.36%, respectively. Furthermore, with 640×640 resolution images as input, the UAV-DMDet model achieves an FP32 inference speed of 31 frames per second on a GeForce RTX 3090 GPU and an FP16 inference speed of 58 frames per second on a Huawei Ascend 710 processor, making it effectively applicable to real-time UAV-based RGB-T multispectral object detection tasks.
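
The mAP@0.5 and mAP@0.5:0.95 figures quoted in the abstract depend on an intersection-over-union (IoU) test between predicted and ground-truth boxes. The abstract does not describe the evaluation protocol, so the following is only a minimal sketch that assumes the common COCO-style convention, in which mAP@0.5:0.95 averages over the ten IoU thresholds 0.50, 0.55, ..., 0.95. The names `iou`, `IOU_THRESHOLDS`, and `thresholds_passed` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

# Box layout assumed here: (x_min, y_min, x_max, y_max) in pixels.
Box = Sequence[float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS: List[float] = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred: Box, truth: Box) -> List[float]:
    """Return the IoU thresholds at which `pred` would match `truth`."""
    overlap = iou(pred, truth)
    return [t for t in IOU_THRESHOLDS if overlap >= t]
```

Under this assumption, a prediction that overlaps its target only loosely counts toward mAP@0.5 but fails the stricter thresholds, which is one reason gains in mAP@0.5 and mAP@0.5:0.95 can differ in size.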

Key words

unmanned aerial vehicle (UAV) / visible and thermal infrared (RGB-T) multispectral object detection / dataset / multi-modal feature fusion / YOLOv8

Classification

Information Technology and Security Science

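Cite this article

Wang Jinzhong, Dai Shun, Zhang Xiuwei, Tian Xuetao, Xing Yinghui, Wang Fang, Yin Hanlin, Zhang Yanning. UAV-RGBT Multispectral Object Detection Dataset and Algorithm Benchmark[J]. Acta Electronica Sinica, 2025, 53(3): 686-704.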

Funding

National Natural Science Foundation of China (No.61971356)

Natural Science Basic Research Program of Shaanxi Province (No.2024JC-DXWT-07, No.2024JC-YBQN-0719)

Key Research and Development Program of Shaanxi Province (No.2023-YBGY-012)

Basic and Applied Basic Research Foundation of Guangdong Province (No.2024A1515030186)

Acta Electronica Sinica (电子学报)

Open access; Peking University Core Journal

ISSN 0372-2112
