Self-Supervised Visual Odometry with RGB-D Bimodal Mutual Guidance

Journal of Signal Processing (信号处理), 2026, Vol. 42, Issue 3: 398-408 (11 pages). DOI: 10.12466/xhcl.2026.03.009

Self-Supervised Visual Odometry with RGB-D Bimodal Mutual Guidance

史宝坤¹, 章宏亮¹, 田子安¹, 马伟¹, 米庆¹, 毋立芳²

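Author information

1. School of Computer Science, Beijing University of Technology, Beijing 100124, China
2. School of Information Science and Technology, Beijing University of Technology, Beijing 100124, China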

Abstract

Visual odometry, which estimates camera poses from image sequences, plays a vital role in robotic navigation, autonomous driving, and augmented reality. Self-supervised visual odometry has become a research focus because it does not require ground-truth pose data: it optimizes pose and depth estimation by constructing a self-supervised loss based on geometric consistency across views. A key challenge in this framework is designing network architectures that fully exploit the complementary pose-related cues in RGB images and depth maps. Existing methods often overlook the heterogeneous characteristics and complementary value of the two modalities, leading to insufficient cue utilization and limited pose estimation accuracy. To address this issue, this paper proposes BMG-VO, a self-supervised visual odometry method with RGB-D bimodal mutual guidance. Specifically, an RGB-guided depth detail enhancement module incorporates texture and color priors from RGB images into the shallow layers of the depth encoding branch. This enhances the ability of depth features to capture fine details, such as edges and textures, thereby improving the robustness of feature matching. Meanwhile, a depth-guided RGB semantic enhancement module reinforces the high-level features of the RGB encoding branch with geometric structure and intra-class consistency cues derived from depth maps. This increases robustness to illumination variations and provides more reliable matching features for pose regression. Additionally, a unimodal filtering module highlights the most essential pose-related cues within each modality. Extensive experiments on the KITTI dataset demonstrate that BMG-VO achieves higher pose estimation accuracy than state-of-the-art self-supervised methods while also attaining excellent depth estimation performance.
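
The abstract does not spell out the form of the self-supervised loss. As a rough illustration only, the sketch below shows the view-synthesis idea commonly used in self-supervised visual odometry: a target pixel is back-projected with its depth, moved by the estimated relative pose, and re-projected into a neighbouring view, and the intensity difference acts as the training signal. The function names `reproject_pixel` and `photometric_l1`, the nearest-neighbour lookup, and the plain L1 error are assumptions made for this example, not details of BMG-VO; practical systems typically use differentiable bilinear sampling and additional loss terms.

```python
def reproject_pixel(u, v, depth, K, R, t):
    """Map target pixel (u, v) with known depth into the source view.

    K = (fx, fy, cx, cy) are pinhole intrinsics (assumed shared by both views).
    R (3x3 nested lists) and t (length-3 list) form the relative pose that
    takes target-camera coordinates to source-camera coordinates.
    """
    fx, fy, cx, cy = K
    # Back-project the pixel to a 3D point in the target camera frame.
    X = [(u - cx) / fx * depth, (v - cy) / fy * depth, depth]
    # Apply the rigid transform into the source camera frame.
    Xs = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    # Project the point onto the source image plane.
    return fx * Xs[0] / Xs[2] + cx, fy * Xs[1] / Xs[2] + cy


def photometric_l1(target, source, depth, K, R, t):
    """Mean absolute intensity difference between target pixels and their
    nearest-neighbour reprojections in the source image.

    Pixels that fall outside the source image are skipped.
    """
    h, w = len(target), len(target[0])
    total, count = 0.0, 0
    for v in range(h):
        for u in range(w):
            us, vs = reproject_pixel(u, v, depth[v][u], K, R, t)
            ui, vi = round(us), round(vs)
            if 0 <= ui < w and 0 <= vi < h:
                total += abs(target[v][u] - source[vi][ui])
                count += 1
    return total / count if count else 0.0
```

When the pose and depth are correct, the reprojected intensities match and the error is small; a wrong pose or depth raises it, which is what lets the networks be trained without ground-truth poses.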

Key words

visual odometry / self-supervised learning / RGB-D bimodal mutual guidance / simultaneous localization and mapping

Classification

Information Technology and Security Science

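Cite this article

史宝坤, 章宏亮, 田子安, 马伟, 米庆, 毋立芳. Self-Supervised Visual Odometry with RGB-D Bimodal Mutual Guidance [J]. Journal of Signal Processing (信号处理), 2026, 42(3): 398-408 (11 pages).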

Funding

Beijing Natural Science Foundation (4252029)

National Natural Science Foundation of China (62576017, 62176010)

Journal of Signal Processing (信号处理)

ISSN: 1003-0530
