Computer Engineering and Applications, 2024, Vol. 60, Issue (14): 240-249, 10. DOI: 10.3778/j.issn.1002-8331.2305-0022

Gaze Target Detection Network Based on Attention Mechanism and Depth Prior

Zhu Yun 1, Zhu Dongchen 2, Zhang Guanghui 3, Sun Yanzan 4, Zhang Xiaolin 5

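Author information

  • 1. School of Communication and Information Engineering, Shanghai University, Shanghai 200444; Bionic Vision System Laboratory, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050
  • 2. Bionic Vision System Laboratory, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050; University of Science and Technology of China, Hefei 230026
  • 3. Bionic Vision System Laboratory, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050
  • 4. School of Communication and Information Engineering, Shanghai University, Shanghai 200444
  • 5. Bionic Vision System Laboratory, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050; University of Science and Technology of China, Hefei 230026; ShanghaiTech University, Shanghai 201210; Xiongan Institute of Innovation, Chinese Academy of Sciences, Xiongan, Hebei 071702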
Abstract

Human gaze behavior, as a non-verbal cue, plays a crucial role in revealing human intentions, and gaze target detection has attracted extensive attention from the machine vision community. However, existing gaze target detection methods usually focus on extracting texture information from images and ignore the importance of stereo depth information, which makes it difficult for them to handle scenes with complex texture. In this work, a novel gaze target detection network based on an attention mechanism and a depth prior is proposed. It adopts a two-stage architecture: a gaze direction prediction stage and a saliency detection stage. In the gaze direction prediction stage, a channel-spatial attention module is established to recalibrate texture features, and a head position encoding branch is designed so that the resulting high-level features are aware of both texture and head position and support accurate gaze prediction. Furthermore, a strategy is proposed to introduce depth, which represents the stereoscopic or distance information of the 3D scene, as a prior into the saliency detection stage. At the same time, the channel-spatial attention mechanism is used to enhance the multi-scale texture features, so that the advantages of depth geometric information and image texture information are both fully exploited to improve the accuracy of gaze target detection. Experimental results show that the proposed model performs favorably against state-of-the-art methods on the GazeFollow and DLGaze datasets.
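
The abstract names a channel-spatial attention module that recalibrates texture features but does not specify its design. As a rough illustration of the general idea only, the sketch below gates a feature map first per channel and then per spatial location. Everything in it is an assumption made for illustration: the function names, the parameter-free sigmoid gates, and the channel-then-spatial order are hypothetical and are not taken from the paper, whose module presumably uses learned layers.

```python
import math


def sigmoid(x):
    """Logistic function, used here as a simple gate in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat):
    """Scale each channel by a sigmoid gate of its global average.

    feat is a nested list indexed as feat[channel][row][col].
    Assumption: a parameter-free gate stands in for learned layers.
    """
    out = []
    for channel in feat:
        values = [v for row in channel for v in row]
        gate = sigmoid(sum(values) / len(values))
        out.append([[v * gate for v in row] for row in channel])
    return out


def spatial_attention(feat):
    """Scale each location by a sigmoid gate of its cross-channel mean."""
    n_channels = len(feat)
    height, width = len(feat[0]), len(feat[0][0])
    gates = [
        [sigmoid(sum(feat[c][i][j] for c in range(n_channels)) / n_channels)
         for j in range(width)]
        for i in range(height)
    ]
    return [
        [[feat[c][i][j] * gates[i][j] for j in range(width)]
         for i in range(height)]
        for c in range(n_channels)
    ]


def channel_spatial_attention(feat):
    """Apply channel gating, then spatial gating (order is an assumption)."""
    return spatial_attention(channel_attention(feat))
```

Calling `channel_spatial_attention` on a feature map of shape channels x height x width returns a map of the same shape in which every value has been multiplied by two gates between 0 and 1, so strongly activated channels and locations are attenuated less than weak ones.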

Key words

gaze target detection/attention mechanism/depth prior/feature aggregation/neural network

Classification

Information technology and security science

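Cite this article

Zhu Yun, Zhu Dongchen, Zhang Guanghui, Sun Yanzan, Zhang Xiaolin. Gaze Target Detection Network Based on Attention Mechanism and Depth Prior [J]. Computer Engineering and Applications, 2024, 60(14): 240-249, 10.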

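Funding

Shanghai Municipal Science and Technology Major Project "Basic, Translational, and Applied Research on Brain and Brain-Inspired Intelligence" (2018SHZDZX01).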

Journal: Computer Engineering and Applications

Indexing: OA, Peking University Core Journals, CSTPCD

ISSN: 1002-8331
