
Research on multimodal generative image data augmentation for 3D object detection

张光钱, 周广利, 黄飞, 刘文兵, 向阳开

Journal of Chongqing University of Technology (重庆理工大学学报), 2024, Vol. 38, Issue 19: 13-20, 8.

DOI: 10.3969/j.issn.1674-8425(z).2024.10.002

A multimodal generative image data enhancement for 3D object detection

张光钱¹, 周广利², 黄飞², 刘文兵², 向阳开¹

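Author information

1. School of Mechatronics and Vehicle Engineering, Chongqing Jiaotong University, Chongqing 400074
2. China Road and Bridge Corporation, Beijing 100010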
Abstract

Traditional generative image data augmentation algorithms usually lose 3D attribute information, which makes them unsuitable for 3D object detection in autonomous driving. To address this problem, we propose a multimodal image enhancement algorithm based on the Stable Diffusion model and use it to develop a data augmentation method designed specifically for 3D object detection. The algorithm further constrains the image generation process by introducing additional modal inputs. A multimodal feature online generation module extracts scene descriptions, semantic distributions, and depth features in real time. For the multimodal feature fusion network, an enhanced gated self-attention module is designed to accurately capture depth information in the latent feature space. This effectively preserves the 3D attribute information of the image while allowing targeted modifications to 2D features such as texture, color, and illumination. Leveraging the algorithm's depth-preserving characteristics, the newly generated images are combined with 3D pseudo-labels to create novel image samples, thereby achieving data augmentation. 3D detection results on the public nuScenes dataset demonstrate the effectiveness of the algorithm in preserving 3D attributes, particularly for larger categories such as buses and trucks, whose AP values improve by 17.2% and 14.1%, respectively. In addition, mAP and NDS increase by 6.8% and 3.4%, respectively.
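
The final augmentation step described in the abstract, pairing each newly generated image with 3D labels carried over from its source image, can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the `Sample` record, the seven-value box layout, `augment_dataset`, and the stand-in `fake_generator` are all hypothetical names, and the actual diffusion-based generator is replaced by a function that merely derives a new image identifier.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Sequence, Tuple

# Hypothetical box layout: (x, y, z, length, width, height, yaw).
Box3D = Tuple[float, float, float, float, float, float, float]


@dataclass(frozen=True)
class Sample:
    image_id: str                # identifier of the camera image
    boxes_3d: Tuple[Box3D, ...]  # 3D annotations tied to scene geometry
    labels: Tuple[str, ...]      # class name for each box
    is_generated: bool = False   # True for augmented samples


def augment_dataset(
    samples: Sequence[Sample],
    generate_fn: Callable[[str, int], str],
    variants_per_sample: int = 1,
) -> List[Sample]:
    """Return the originals plus generated variants that reuse the source 3D labels."""
    out: List[Sample] = list(samples)
    for sample in samples:
        for k in range(variants_per_sample):
            new_id = generate_fn(sample.image_id, k)
            # Assumption: generation keeps depth and geometry unchanged, so the
            # source boxes and labels are copied over as 3D pseudo-labels.
            out.append(replace(sample, image_id=new_id, is_generated=True))
    return out


def fake_generator(image_id: str, variant: int) -> str:
    """Stand-in for the generative model: only derives a new identifier."""
    return f"{image_id}_gen{variant}"
```

The sketch copies labels unchanged because the abstract states that generation alters only 2D appearance (texture, color, illumination); if a generator changed scene geometry, reusing the boxes in this way would no longer be valid.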

Key words

data enhancement / stable diffusion / image generation / object detection / feature fusion

Classification

Information technology and security science

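Cite this article

张光钱, 周广利, 黄飞, 刘文兵, 向阳开. A multimodal generative image data enhancement for 3D object detection [J]. Journal of Chongqing University of Technology (重庆理工大学学报), 2024, 38(19): 13-20, 8.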

Funding

Chongqing Major Science and Technology Innovation R&D Project (CSTB2022TIAD-STX0003)

Journal of Chongqing University of Technology (重庆理工大学学报)

Open access; Peking University Core Journal (北大核心)

ISSN: 1674-8425
