Binocular vision localization and measurement method based on TransUNet with multi-scale attention mechanism

YANG Yu (杨玉), XU Sixiang (许四祥), ZHANG Mengquan (张梦权), WU Duanzheng (吴端正)

Optics and Precision Engineering (光学精密工程), 2025, Vol. 33, Issue 16: 2502-2515, 14. DOI: 10.37188/OPE.20253316.2502

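YANG Yu¹, XU Sixiang¹, ZHANG Mengquan¹, WU Duanzheng¹

Author information

  • 1. School of Mechanical Engineering, Anhui University of Technology, Ma'anshan 243032, Anhui, China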

Abstract

To address the low detection efficiency of traditional binocular-vision feature-detection algorithms, as well as the insufficient attention to globally important features and the excessive parameter count of most network models, a binocular-vision method for continuous-casting billet localization and measurement based on TransUNet with a multi-scale attention mechanism is proposed. First, left and right images of continuous-casting billets are collected with a calibrated parallel binocular camera to build a dataset. Next, with TransUNet as the backbone, an improved Transformer layer is introduced to extract global context information; a Global Spatial Group Attention (GSGA) module is appended after every decoder block to strengthen the focus on globally salient features through a grouped multi-scale attention mechanism; and a Convolutional Block Attention Module (CBAM) is inserted after each encoder-decoder skip connection and bilinear interpolation to improve key-point recognition by combining spatial and channel attention. Finally, the key-point coordinates output by the network are used for 3D coordinate reconstruction and distance measurement based on binocular-vision principles. Experimental results show that, compared with the Transformer model, the root-mean-square error and normalized error are reduced by 33.8% and 36.83%, respectively; the number of parameters and floating-point operations are reduced by 10.58% and 8.21%, respectively; and the single-batch inference time is shortened by 32.30%. In 3D ranging, the relative measurement error reaches 0.137%, which is significantly better than that of traditional feature detection algorithms and meets the requirements of binocular-vision localization and measurement.
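The final step of the abstract (3D reconstruction and distance measurement from matched key points) can be illustrated with the standard parallel-stereo triangulation formulas. The following is only a minimal sketch, assuming rectified images from an ideal parallel rig; it is not the authors' implementation. The function names (`triangulate`, `distance`, `relative_error`), the parameters (`f`, `baseline`, `cx`, `cy`), and all numeric values are hypothetical and chosen purely for illustration.

```python
import math


def triangulate(uv_left, uv_right, f, baseline, cx, cy):
    """Recover (X, Y, Z) in the left-camera frame from one matched key point.

    Assumption: rectified, parallel cameras, so the match differs only by a
    horizontal disparity d = u_left - u_right.
    f        : focal length in pixels (hypothetical calibration value)
    baseline : distance between the two optical centres, e.g. in mm
    cx, cy   : principal point in pixels
    """
    (ul, vl), (ur, _) = uv_left, uv_right
    d = ul - ur
    if d <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    z = f * baseline / d          # depth from similar triangles
    x = (ul - cx) * z / f         # back-project the left pixel to metric X
    y = (vl - cy) * z / f         # back-project the left pixel to metric Y
    return (x, y, z)


def distance(p, q):
    """Euclidean distance between two reconstructed 3D points."""
    return math.dist(p, q)


def relative_error(measured, reference):
    """Relative measurement error in percent."""
    return abs(measured - reference) / reference * 100.0


# Illustrative values only: two key points seen by both cameras.
a = triangulate((700.0, 360.0), (650.0, 360.0), f=1000.0, baseline=100.0, cx=640.0, cy=360.0)
b = triangulate((800.0, 360.0), (750.0, 360.0), f=1000.0, baseline=100.0, cx=640.0, cy=360.0)
length = distance(a, b)  # 200.0 (same unit as the baseline) for these made-up inputs
```

In practice the pixel coordinates would come from the key-point network and the intrinsics from camera calibration; lens distortion and rectification are outside the scope of this sketch.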

Key words

binocular vision/TransUNet/keypoint detection/attention mechanism

Classification

Information Technology and Security Science

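Cite this article

YANG Yu, XU Sixiang, ZHANG Mengquan, WU Duanzheng. Binocular vision localization and measurement method based on TransUNet with multi-scale attention mechanism[J]. Optics and Precision Engineering, 2025, 33(16): 2502-2515, 14.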

Funding

National Natural Science Foundation of China (No. 51374007)

Key Project of Natural Science Research of Anhui Higher Education Institutions (No. KJ2020A0259)

Optics and Precision Engineering (光学精密工程)

Open Access; Peking University Core Journal

ISSN 1004-924X
