Research on an adolescent gaze estimation algorithm based on CB-ViT
Indexing: Open Access (OA); Peking University Core Journals; CSTPCD
Gaze estimation is widely used in human-computer interaction (HCI), virtual reality, and medical diagnostic assistance. However, existing public datasets mainly cover adults, so gaze estimation algorithms trained on them usually perform poorly when applied to adolescents. To address this problem, an adolescent gaze dataset named "Young-Gaze" is collected, containing gaze data from 107 adolescents. A 2D gaze estimation algorithm is also proposed. It is based on the vision transformer (ViT) and introduces a context broadcasting (CB) module; in addition, it fuses left-eye and right-eye features from different levels, which significantly strengthens the feature representation of the network. In experiments, the algorithm achieves an error of 5.42 cm on Young-Gaze, outperforming other current 2D gaze estimation algorithms of the same kind. The algorithm is also trained and tested on the public 2D gaze estimation datasets GazeCapture and MPIIFaceGaze, where it likewise performs well. This indicates that the algorithm is suitable not only for adolescents but also for adults.
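The abstract names two quantitative ingredients without spelling them out: the context broadcasting step added to ViT, and an error reported in centimetres. The sketch below illustrates both under stated assumptions and is not the authors' implementation. It assumes CB blends each token with the mean of all tokens using equal weights (one common formulation), and that the reported error is the mean Euclidean distance between predicted and true 2D gaze points. The function names `context_broadcast` and `mean_euclidean_error_cm` are hypothetical, and plain Python lists stand in for tensors.

```python
import math


def context_broadcast(tokens):
    """Blend every token with the mean token (assumed equal 0.5/0.5 weights).

    tokens: list of N token vectors, each a list of D floats.
    Returns a new list of N vectors with the same shape.
    """
    n = len(tokens)
    dim = len(tokens[0])
    # Mean over the token axis: the shared "context" vector.
    mean = [sum(tok[d] for tok in tokens) / n for d in range(dim)]
    # Broadcast the context back onto every token.
    return [[0.5 * tok[d] + 0.5 * mean[d] for d in range(dim)] for tok in tokens]


def mean_euclidean_error_cm(pred, truth):
    """Average Euclidean distance between paired 2D points, in the input units.

    Assumption: both inputs are (x, y) gaze points expressed in centimetres.
    """
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(pred)


if __name__ == "__main__":
    toy_tokens = [[0.0, 2.0], [2.0, 4.0]]
    print(context_broadcast(toy_tokens))
    print(mean_euclidean_error_cm([(0.0, 0.0), (3.0, 4.0)], [(3.0, 4.0), (3.0, 4.0)]))
```

Under this formulation the step has no learnable parameters and leaves the mean token unchanged, so it only pulls individual tokens toward the shared context.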
Yan Qingsong (严青松); Mao Jianhua (毛建华); Liu Zhi (刘志); Lu Xiaofeng (陆小锋)
School of Communication and Information Engineering, Shanghai University, Shanghai 200444; Wenzhou Institute of Shanghai University, Wenzhou, Zhejiang 325000
Electronic and information engineering
Keywords: gaze estimation; head pose; CNN; feature fusion; ViT; context broadcasting (CB)
Modern Electronics Technique (《现代电子技术》), 2024 (15)
Pages 146-150 (5 pages)
Funding: Wenzhou Major Scientific and Technological Innovation Project (ZY2023003)