计算机技术与发展 (Computer Technology and Development), 2024, Vol. 34, Issue 10: 38-45, 8. DOI: 10.20165/j.cnki.ISSN1673-629X.2024.0183
基于多尺度融合和高阶交互的单目3D检测算法
Monocular 3D Detection Algorithm Based on Multi-scale Fusion and High-order Interaction
Abstract
3D object detection is a fundamental and challenging task in 3D scene understanding. Monocular vision-based methods can serve as an economical alternative to stereo- or radar-based methods. An improved monocular 3D detection algorithm based on MonoDLE is proposed to reduce the accuracy loss caused by the deviation between size and shape and the 3D position. First, a general multi-scale pooled attention module is proposed to aggregate finer multi-scale features and efficient context information. Second, to enhance the model's high-order spatial interaction ability, a recursive gated convolution block, composed of recursive gated convolution and group normalization (GN), is proposed to replace the convolution layers of the up-sampling module in the baseline architecture, effectively improving the representational ability of that module. Experimental results on KITTI, a widely used dataset for monocular 3D detection, show that after the multi-scale pooled attention module improves the network's aggregation ability, the average precision metric AP40 of the proposed algorithm rises from 13.66 to 15.10 under the standard setting of the 3D view with an intersection-over-union (IoU) ratio greater than 0.70. After the recursive gated convolution blocks enhance the model's high-order spatial interaction ability, AP40 rises further from 15.10 to 15.53 under the same 3D-view setting with IoU greater than 0.70. With the two modules working together, AP40 also improves from 19.33 to 21.95 in the bird's-eye view with IoU greater than 0.70.
Keywords
monocular 3D detection / feature pyramid pooling / attention mechanism / recursive gated convolution / group normalization
Classification
Information Technology and Security Science
Cite this article
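孙延康, 王璇之, 封澳, 谢玉阳, 肖建. Monocular 3D Detection Algorithm Based on Multi-scale Fusion and High-order Interaction [J]. 计算机技术与发展 (Computer Technology and Development), 2024, 34(10): 38-45, 8.
Funding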
National Natural Science Foundation of China (61974073)
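
The abstract reports its results as AP40 scores at a fixed overlap threshold. As a rough illustration of how such a score can be computed, the sketch below assumes that AP40 denotes the interpolated average precision sampled at 40 evenly spaced recall positions (1/40, 2/40, ..., 1), which is the convention commonly used on KITTI. The function names `ap_r40` and `absolute_gain` are hypothetical and do not come from the paper, and the precision/recall points in the usage note are made up for the example.

```python
def ap_r40(recalls, precisions):
    """Interpolated average precision over 40 recall positions, in percent.

    Assumption: AP40 samples recall at 1/40, 2/40, ..., 1 and, at each
    position r, takes the best precision achieved at any recall >= r.
    """
    points = list(zip(recalls, precisions))
    total = 0.0
    for k in range(1, 41):
        r = k / 40
        # Precision values reachable at this recall level or beyond.
        candidates = [p for rec, p in points if rec >= r - 1e-12]
        total += max(candidates) if candidates else 0.0
    return 100.0 * total / 40


def absolute_gain(before, after):
    """Absolute change between two AP40 scores, rounded to two decimals."""
    return round(after - before, 2)


if __name__ == "__main__":
    # Toy curve: precision 1.0 up to recall 0.5, then 0.5 up to recall 1.0.
    print(ap_r40([0.5, 1.0], [1.0, 0.5]))
    # Gains implied by the AP40 pairs quoted in the abstract.
    for before, after in [(13.66, 15.10), (15.10, 15.53), (19.33, 21.95)]:
        print(before, "->", after, "gain:", absolute_gain(before, after))
```

Applied to the figures in the abstract, the absolute gains are 1.44 points (13.66 to 15.10), 0.43 points (15.10 to 15.53), and 2.62 points (19.33 to 21.95).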