Journal of Zhengzhou University (Engineering Science), 2024, Vol. 45, Issue (1): 70-77, 121, 9. DOI: 10.13705/j.issn.1671-6833.2024.01.017
基于门控时空注意力的视频帧预测模型
Video Frame Prediction Model Based on Gated Spatio-Temporal Attention
Abstract
A video frame prediction model based on gated spatio-temporal attention is proposed to address the low accuracy, slow training, complex structure, and error accumulation of recurrent video frame prediction architectures. First, a spatial encoder extracts high-level semantic information from the video frame sequence while preserving background features. Second, a gated spatio-temporal attention mechanism uses multi-scale deep bar convolutions and channel attention to learn both intra-frame and inter-frame spatio-temporal features, and a gate fusion mechanism balances the feature learning capability of the spatio-temporal attention. Finally, a spatial decoder reconstructs the high-level features into realistic predicted frames and complements them with background semantics to enhance detail. Experimental results on the Moving MNIST, TaxiBJ, WeatherBench, and KITTI datasets show that, compared with the multi-input multi-output model SimVP, the mean squared error (MSE) is reduced by 14.7%, 6.7%, 10.5%, and 18.5%, respectively. In ablation and extension experiments, the proposed model achieves good overall performance, with high prediction accuracy, low computational complexity, and efficient inference.
Keywords
video frame prediction/convolutional neural network/attention mechanism/gated convolution/encoder-decoder network
Classification
Information Technology and Security Science
Cite this article
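李卫军, 张新勇, 高庾潇, 顾建来, 刘锦彤. Video frame prediction model based on gated spatio-temporal attention [J]. Journal of Zhengzhou University (Engineering Science), 2024, 45(1): 70-77, 121, 9.
Funding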
Fundamental Research Funds for the Central Universities (2021JCYJ12)
National Natural Science Foundation of China (61962001)
Natural Science Foundation of Ningxia (2021AAC03215)
North Minzu University Graduate Innovation Project (YCX23147)
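
The percentages quoted in the abstract are relative MSE reductions against a baseline (SimVP). The minimal sketch below shows how such a figure is typically computed. The function names and the toy pixel values are hypothetical and chosen only for illustration; they are not taken from the paper or its code.

```python
def mse(pred, target):
    """Mean squared error between two equal-length, non-empty sequences."""
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and equal in length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def relative_reduction(baseline_mse, model_mse):
    """Percentage by which model_mse is lower than baseline_mse."""
    return 100.0 * (baseline_mse - model_mse) / baseline_mse


# Hypothetical toy pixel values, NOT results from the paper.
target = [0.0, 0.5, 1.0, 0.5]
baseline_pred = [0.2, 0.5, 0.8, 0.5]  # stands in for a baseline prediction
model_pred = [0.1, 0.5, 0.9, 0.5]     # stands in for an improved prediction

baseline = mse(baseline_pred, target)
ours = mse(model_pred, target)
reduction = relative_reduction(baseline, ours)
print(f"{reduction:.1f}% lower MSE than the baseline")
```

A relative reduction is independent of the absolute error scale, which is why a single percentage per dataset can be compared across benchmarks whose raw MSE values differ by orders of magnitude.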