Journal of Chongqing Technology and Business University (Natural Science Edition), 2024, Vol. 41, Issue 4: 104-111, 8. DOI: 10.16055/j.issn.1672-058X.2024.0004.013
基于多尺度特征混合注意力的连续帧深度估计
Continuous Frame Depth Estimation Based on Multi-scale Feature Mixed Attention Mechanism
Abstract
Objective: In monocular visual SLAM, depth estimation is the means of recovering the distance between the photographed object and the camera. Because unsupervised monocular depth estimation algorithms suffer from insufficient accuracy and large errors, a continuous frame depth estimation network based on a hybrid attention mechanism with multi-scale feature fusion was proposed.
Methods: Depth information and 6-degree-of-freedom pose information were obtained by two encoder-decoder structures, one for depth estimation and one for pose estimation. The depth and pose information were used to reconstruct the image, and the loss was computed against the original image to produce the depth output. The encoder-decoder structure for depth estimation formed a U-shaped network. The pose estimation network shared the encoder of the depth estimation network, and the pose information was output through the pose estimation decoder. Feature maps at four different scales were extracted in the encoder using the hybrid attention mechanism CBAM combined with a ResNet network. To enhance the contour details of the estimated depth, the features extracted at each scale were assigned learnable weight coefficients to extract local and global features, which were then fused with the original features.
Results: Error and accuracy were evaluated on the KITTI dataset, and additional tests were run on self-made images. Compared with the classical monocular method monodepth2, the relative error, root mean square error, and log root mean square error were reduced by 0.034, 0.129, and 0.002, respectively, and the self-made test images demonstrated the generalizability of the network.
Conclusion: Multi-scale features are extracted using a ResNet network combined with a hybrid attention mechanism, and multi-scale feature fusion of the extracted features enhances the depth estimation and improves the contour details.
Keywords
monocular vision / continuous frame depth estimation / hybrid attention mechanism / multi-scale feature fusion
Classification
Information Technology and Security Science
Cite this article
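ZHENG Yuhang, CAO Chuqing. Continuous Frame Depth Estimation Based on Multi-scale Feature Mixed Attention Mechanism [J]. Journal of Chongqing Technology and Business University (Natural Science Edition), 2024, 41(4): 104-111, 8.
Funding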
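National Natural Science Foundation of China, General Program (62073101)
Support Program for Outstanding Young Talents in Universities (019YQQ023)
Key Scientific Research Project of the Anhui Provincial Department of Education (KJ2020A0364)
National Key R&D Program of China, "Intelligent Robots" Key Special Project (2018YFB1308900)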
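The three error measures named in the Results part of the abstract can be illustrated with a minimal Python sketch. It assumes that "relative error" means the mean absolute relative error (Abs Rel) commonly reported on KITTI, which the abstract does not state. The function name `depth_errors` and the sample depth values are hypothetical and are not taken from the paper.

```python
import math


def depth_errors(gt, pred):
    """Return (abs_rel, rmse, rmse_log) for paired, strictly positive depths.

    gt   -- ground-truth depths (e.g. in metres)
    pred -- predicted depths at the same pixels
    Assumption: "relative error" is taken to be the mean absolute relative error.
    """
    if len(gt) != len(pred) or len(gt) == 0:
        raise ValueError("gt and pred must be non-empty and of equal length")
    if any(g <= 0 or p <= 0 for g, p in zip(gt, pred)):
        raise ValueError("depths must be strictly positive")

    n = len(gt)
    # Mean absolute relative error: |gt - pred| / gt, averaged over pixels.
    abs_rel = sum(abs(g - p) / g for g, p in zip(gt, pred)) / n
    # Root mean square error in depth units.
    rmse = math.sqrt(sum((g - p) ** 2 for g, p in zip(gt, pred)) / n)
    # Root mean square error of the log-depth difference.
    rmse_log = math.sqrt(
        sum((math.log(g) - math.log(p)) ** 2 for g, p in zip(gt, pred)) / n
    )
    return abs_rel, rmse, rmse_log


# Hypothetical sample values, for illustration only.
gt = [2.0, 4.0, 8.0]
pred = [2.0, 5.0, 6.0]
abs_rel, rmse, rmse_log = depth_errors(gt, pred)
print(f"abs_rel={abs_rel:.4f} rmse={rmse:.4f} rmse_log={rmse_log:.4f}")
```

All three measures are errors, so lower is better. A reduction such as the reported 0.129 in root mean square error is expressed in the depth unit itself, whereas the relative and log errors are dimensionless.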