Gaze Estimation Model Based on Hybrid Transformer

Cheng Zhang, Liu Dan, Wang Yanxia

Computer and Modernization, Issue (4): 1-5, 11, 6. DOI: 10.3969/j.issn.1006-2475.2025.04.001


Author information

1. College of Computer and Information Science, Chongqing Normal University, Chongqing 401331, China

Abstract

Combining a CNN with a Transformer lets a model exploit global feature information and strengthens its awareness of context, which can improve accuracy. This paper proposes RN-SA (ResNet-MHSA), a novel gaze estimation model based on a hybrid Transformer. In this model, some of the 3×3 spatial convolution layers in ResNet18 are replaced with a block composed of a 1×1 spatial convolution layer and a Multi-Head Self-Attention (MHSA) layer, and the DropBlock mechanism is added to the model structure to increase robustness. Experimental results show that RN-SA improves accuracy while reducing the number of parameters: compared with GazeTR-Hybrid, a strong existing model, RN-SA improves accuracy by 4.1% and 3.7% on the EyeDiap and Gaze360 datasets, respectively, while using 15.8% fewer parameters. The combination of CNN and Transformer can therefore be applied effectively to gaze estimation tasks.
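As a rough illustration of why swapping a 3×3 convolution for a 1×1 convolution plus MHSA can reduce parameters, the sketch below counts weights for one layer. It is not the authors' implementation. The abstract does not give the block's internal layout, so the following are assumptions: the MHSA layer uses query, key, value and output projections of equal width, biases and positional encodings are ignored, and the function names are hypothetical. The numbers describe a single layer and are not expected to reproduce the reported 15.8% figure, which compares whole models.

```python
def conv_params(c_in: int, c_out: int, k: int, bias: bool = False) -> int:
    """Weights (and optional biases) of a k x k convolution."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def mhsa_params(channels: int, out_proj: bool = True) -> int:
    """Query, key and value projections, plus an optional output projection.

    The head count does not appear here: heads split the channels among
    themselves and add no weights. Biases and positional encodings are
    ignored (assumption).
    """
    n_proj = 4 if out_proj else 3
    return n_proj * channels * channels


def hybrid_block_params(channels: int) -> int:
    """Hypothetical replacement block: 1x1 convolution followed by MHSA."""
    return conv_params(channels, channels, k=1) + mhsa_params(channels)


if __name__ == "__main__":
    for c in (256, 512):  # channel widths of the last two ResNet18 stages
        conv3 = conv_params(c, c, k=3)
        hybrid = hybrid_block_params(c)
        print(f"C={c}: 3x3 conv {conv3:,} vs 1x1 conv + MHSA {hybrid:,} "
              f"({1 - hybrid / conv3:.1%} fewer)")
```

Under these assumptions the replacement block costs 5C² weights against 9C² for the 3×3 convolution, so the saving per replaced layer is independent of the channel width C.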

Key words

gaze estimation / self-attention / MHSA / Transformer

Classification

Information Technology and Security Science

Cite this article

Cheng Zhang, Liu Dan, Wang Yanxia. Gaze Estimation Model Based on Hybrid Transformer[J]. Computer and Modernization, 2025, (4): 1-5, 11, 6.

Funding

Scientific Research Project of the Chongqing Science and Technology Commission (cstc2021jcyj-msxm2791)

Science and Technology Project of the Chongqing Municipal Education Commission (KJZD-K202200513)

Computer and Modernization

ISSN: 1006-2475
