Chinese Journal of Liquid Crystals and Displays, 2025, Vol. 40, Issue 4: 598-606 (9 pages). DOI: 10.37188/CJLCD.2024-0218
Pose estimation network based on attention feature fusion of multimodal data
Abstract
6D pose estimation that balances accuracy and applicability remains an active and difficult research topic. To this end, a 6D pose estimation network based on attention feature fusion of multimodal data is proposed. First, a deeper squeeze-and-excitation module is introduced, which adjusts the weight of each channel to strengthen channel dependencies and enlarge the receptive field, improving the processing of RGB image features. Second, for multimodal data, an iterative attention feature fusion module is deployed in the feature fusion stage; through multiple iterative fusion operations it resolves the scale inconsistency problem in global feature fusion and captures and integrates multimodal data more accurately, which significantly improves pose regression. Finally, to quantitatively assess the robustness and applicability of the model in complex environments, an invisibility percentage metric is introduced, which measures the model's performance under partial occlusion or against complex backgrounds. Pose prediction experiments on a public dataset verify that the improved model not only predicts poses accurately on the validation dataset but is also more applicable in complex environments than the DenseFusion model.
Keywords
6D pose estimation / multimodal data / attention feature fusion / occlusion percentage
Classification
Computer science and automation
Cite this article
ZHAO Yuntao, DENG Xinhui. Pose estimation network based on attention feature fusion of multimodal data [J]. Chinese Journal of Liquid Crystals and Displays, 2025, 40(4): 598-606.
Funding
Supported by the Open Project of Hubei Provincial Engineering Research Center (No. IDICP-KF-2024-08)