Journal of Beijing Jiaotong University, 2024, Vol. 48, Issue (5): 78-87, 10. DOI: 10.11860/j.issn.1673-0291.20240003
Multimodal fusion 3D object detection based on NNC-EPNet (基于NNC-EPNet的多模态融合3D目标检测)
Abstract
To address the difficulty that current multimodal 3D object detection methods have in effectively integrating target image features, this study proposes NNC-EPNet, a multimodal 3D object detection method that introduces a Nearest Neighbor Correction (NNC) method to mitigate the impact of sparse target point clouds and non-target point clouds. First, the NNC module is designed to refine the sampled point clouds using the enhanced features of neighboring point clouds. This process reduces noise in the point cloud data and strengthens the features of target point clouds, facilitating better integration of target image features. Second, a Multi-Modal Fusion Transformer (MFT) encoder is developed, which uses cross-attention mechanisms to fuse image and point cloud features and introduces a point cloud attention mechanism to aggregate global contextual information, thereby enhancing feature representation capabilities. Finally, comparative experiments are conducted on two standard autonomous driving datasets, KITTI and Waymo. Experimental results show that NNC-EPNet achieves an average detection accuracy of 84.47% on the KITTI dataset, with improvements of 2.00%, 3.25%, and 5.68% in the easy, moderate, and hard scenarios, respectively, compared to the baseline algorithm. On the Waymo dataset, it achieves a weighted average accuracy of 74.48%, an improvement of 2.49% over the baseline algorithm. These results indicate that the two designed modules, NNC and MFT, effectively improve 3D object detection performance.
Key words
3D object detection / multimodality / feature fusion / point cloud correction / attention mechanism
Classification
Information Technology and Security Science
Cite this article
Feng Xia (冯霞), Liang Yulong (梁宇龙), Lu Min (卢敏), Zuo Haichao (左海超). Multimodal fusion 3D object detection based on NNC-EPNet [J]. Journal of Beijing Jiaotong University, 2024, 48(5): 78-87, 10.
Funding
National Natural Science Foundation of China (U2333206)