Learning Gaze Transition for Gaze Target Detection in Video

Yang Xingming, Shi Junbiao, Li Ziqiang, Wu Kewei, Xie Zhao

Computer Engineering and Applications, 2024, Vol. 60, Issue 20: 293-301, 9. DOI: 10.3778/j.issn.1002-8331.2306-0206

Learning Gaze Transition for Gaze Target Detection in Video

Yang Xingming¹, Shi Junbiao², Li Ziqiang², Wu Kewei¹, Xie Zhao¹

Author information

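  • 1. Key Laboratory of Knowledge Engineering with Big Data (Hefei University of Technology), Ministry of Education, Hefei 230601; School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601
  • 2. School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601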

Abstract

Gaze target detection in video aims to localize the gaze target in each video frame. A person gazes at different targets at different times. In the transition segment from one gaze target to another, the person may not be gazing at any specific target. Gaze target detection methods based on image transformers neglect this temporal transition segment, and the gaze direction during the transition may hinder gaze target detection in video. For gaze target detection in video, this paper proposes a gaze transition-based model that contains a gaze direction guidance module and a gaze transition temporal fusion module. In the gaze direction guidance module, the position of the gaze target is used to learn a heatmap of the gaze direction. Detection is guided by this gaze direction heatmap, which suppresses targets outside the gaze direction and yields an accurate position for the gaze target. In the gaze transition temporal fusion module, the heatmaps of multiple frames form a spatial-temporal heatmap. To learn the changes in the spatial-temporal heatmap, this paper uses a bi-directional spatial-temporal convolutional long short-term memory (LSTM) network, which extracts a memory-based spatial-temporal heatmap. The gaze transition is described by a Gaussian-based temporal model. To localize gaze transition segments of uncertain temporal length, this paper designs a Gaussian-based temporal fusion method, which estimates the gaze transition by its start timestamp, end timestamp, and temporal length. By localizing the gaze transition segment, the effect of the transition can be removed from gaze target detection. The gaze transition-based model is trained with a gaze direction-based loss, a gaze target existence loss, a gaze target heatmap loss, and a gaze transition temporal localization loss. On the GazeFollow and VideoAttentionTarget datasets, experimental results show that the gaze transition-based model outperforms the image transformer-based model for gaze target detection in video.
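
The idea of localizing a transition segment and removing its effect can be illustrated with a small sketch. This is not the paper's formulation, which the abstract does not spell out. It assumes that a transition with a given start and end frame is modeled by a Gaussian centered at the segment midpoint, and that frames are down-weighted by that Gaussian before a simple weighted average of per-frame heatmaps. The names `transition_weights`, `fuse_heatmaps`, and `sigma_scale` are hypothetical.

```python
import numpy as np


def transition_weights(num_frames, start, end, sigma_scale=0.5):
    """Per-frame weights that dip toward 0 around a gaze transition segment.

    Assumption (not from the paper): the transition is a Gaussian centered
    at the segment midpoint, with standard deviation proportional to the
    segment length (sigma = sigma_scale * length).
    """
    t = np.arange(num_frames, dtype=float)
    center = 0.5 * (start + end)
    length = max(end - start, 1e-6)  # guard against a zero-length segment
    sigma = sigma_scale * length
    transition_score = np.exp(-0.5 * ((t - center) / sigma) ** 2)
    return 1.0 - transition_score


def fuse_heatmaps(heatmaps, weights):
    """Weighted temporal average of per-frame heatmaps with shape (T, H, W)."""
    w = weights[:, None, None]  # broadcast the frame weights over H and W
    return (w * heatmaps).sum(axis=0) / max(weights.sum(), 1e-8)


# Usage: 10 frames of 3x3 heatmaps, with a transition spanning frames 4 to 6.
weights = transition_weights(num_frames=10, start=4, end=6)
fused = fuse_heatmaps(np.ones((10, 3, 3)), weights)
```

In this sketch, frames near the middle of the transition contribute almost nothing to the fused heatmap, while frames far from it keep a weight close to 1. A learned model would predict the start, end, and length rather than take them as inputs.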

Key words

gaze target detection/gaze transition/gaze target heatmap/spatial-temporal convolution long short-term memory/Gaussian-based temporal fusion

Classification

Information Technology and Security Science

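Cite this article

Yang Xingming, Shi Junbiao, Li Ziqiang, Wu Kewei, Xie Zhao. Learning Gaze Transition for Gaze Target Detection in Video[J]. Computer Engineering and Applications, 2024, 60(20): 293-301, 9.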

Funding

National Key Research and Development Program of China (2017YFB1002203)

Natural Science Foundation of Anhui Province (JZ2021AKZR0351)

Computer Engineering and Applications

OA, Peking University Core Journal, CSTPCD

ISSN 1002-8331
