MonoDI: Monocular 3D Object Detection Based on Fusing Depth Instances

赵科¹, 董浩然¹, 业宁¹

数据采集与处理 (Journal of Data Acquisition and Processing), 2025, Vol. 40, Issue 5: 1322-1332, 11. DOI: 10.16337/j.1004-9037.2025.05.017

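Author information

1. College of Information Science and Technology, College of Artificial Intelligence, Nanjing Forestry University, Nanjing 210037, China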

Abstract

Monocular 3D object detection aims to locate the 3D bounding boxes of objects from a single 2D input image, an extremely challenging task because the image carries no depth information. To address the poor detection performance caused by missing depth information during inference on 2D images, and the background noise interference in depth maps, this paper proposes MonoDI, a monocular 3D object detection method that integrates depth instances. The key idea is to take the depth information generated by an effective depth estimation network, combine it with instance segmentation masks to obtain depth instances, and then integrate the depth instances with 2D image information to aid in regressing 3D object information. To make better use of the depth instance information, this paper designs an iterative depth aware attention fusion module (iDAAFM), which integrates depth instance features with 2D image features to obtain a feature representation with clear object boundaries and depth information. A residual convolutional structure is then introduced during training and inference to replace the general single convolutional structure, ensuring the stability and efficiency of the network when processing the fused information. Further, a 3D bounding box uncertainty auxiliary task is designed to assist the main task in learning to generate bounding boxes during training, improving the accuracy of monocular 3D object detection. Finally, the effectiveness of the method is validated on the KITTI dataset. Experimental results show that the proposed method improves 3D object detection accuracy for the vehicle class at the moderate difficulty level by 4.41 percentage points over the baseline and outperforms comparative methods such as MonoCon and MonoLSS. It also achieves superior results in the KITTI-nuScenes cross-dataset evaluation.
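
The depth-instance idea in the abstract, combining an estimated depth map with instance segmentation masks so that background depth is suppressed, can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulation, so everything here is an assumption for illustration: the function name `depth_instances`, the use of a union over masks, and zero as the background fill value are hypothetical and not taken from the paper.

```python
import numpy as np


def depth_instances(depth_map, instance_masks):
    """Keep depth values only inside instance masks; zero out the background.

    Hypothetical helper, not the paper's implementation.
    depth_map:      (H, W) array of estimated depths.
    instance_masks: (N, H, W) boolean array, one mask per detected instance.
    """
    depth_map = np.asarray(depth_map, dtype=np.float32)
    masks = np.asarray(instance_masks, dtype=bool)
    if masks.ndim != 3 or masks.shape[1:] != depth_map.shape:
        raise ValueError("instance_masks must have shape (N, H, W) matching depth_map")
    # Union of all instance masks: True wherever any object is present.
    foreground = masks.any(axis=0)
    # Assumed background fill value of 0.0; the paper may treat background differently.
    return np.where(foreground, depth_map, 0.0).astype(np.float32)


# Toy example: a 2x3 depth map with one instance covering the two left columns.
depth = np.array([[10.0, 12.0, 30.0],
                  [11.0, 13.0, 31.0]], dtype=np.float32)
masks = np.zeros((1, 2, 3), dtype=bool)
masks[0, :, :2] = True
result = depth_instances(depth, masks)
```

In this toy case the right-hand column, which no mask covers, is zeroed while the masked depths pass through unchanged. How MonoDI then fuses such depth instances with 2D image features (the iDAAFM module) is not specified in the abstract and is not sketched here.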

Keywords

monocular 3D object detection / instance segmentation / feature fusion / residual convolution / auxiliary learning

Classification

Information technology and security science

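Cite this article

赵科, 董浩然, 业宁. MonoDI: Monocular 3D object detection based on fusing depth instances [J]. 数据采集与处理 (Journal of Data Acquisition and Processing), 2025, 40(5): 1322-1332, 11.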

Funding

National Key Research and Development Program of China (2016YFD600101).

数据采集与处理 (Journal of Data Acquisition and Processing)

Open access; Peking University Core Journal (北大核心)

ISSN: 1004-9037
