
Video Frame Prediction Model Based on Gated Spatio-Temporal Attention


To address the low accuracy, slow training, structural complexity, and error accumulation of recurrent video frame prediction architectures, a video frame prediction model based on gated spatio-temporal attention is proposed. First, a spatial encoder extracts high-level semantic information from the video frame sequence while preserving background features. Second, a gated spatio-temporal attention mechanism is established: multi-scale depthwise strip convolutions and channel attention learn intra-frame and inter-frame spatio-temporal features, and a gated fusion mechanism balances the feature-learning capability of the spatio-temporal attention. Finally, a spatial decoder decodes the high-level features into the predicted frames and supplements the background semantics to refine details. Experiments on the Moving MNIST, TaxiBJ, WeatherBench, and KITTI datasets show that, compared with the multi-input multi-output model SimVP, the mean squared error (MSE) is reduced by 14.7%, 6.7%, 10.5%, and 18.5%, respectively. In ablation and extension experiments, the proposed model achieves good overall performance, with high prediction accuracy, low computational cost, and efficient inference.
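The abstract states its gains as percentage reductions in MSE relative to SimVP. A minimal sketch of how such a figure can be computed is given here, assuming the percentages are relative reductions against the baseline MSE (the abstract does not spell out the formula). The function names `mse` and `relative_reduction` are illustrative, and the sample values are made up, not taken from the paper.

```python
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equal-length sequences of pixel values."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def relative_reduction(baseline: float, proposed: float) -> float:
    """Percentage by which `proposed` is lower than `baseline` (assumed definition)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - proposed) / baseline


# Hypothetical flattened frames; these numbers are NOT from the paper.
target = [0.0, 1.0, 1.0, 0.0]
baseline_pred = [0.2, 0.8, 0.6, 0.0]   # stand-in for a baseline model's output
proposed_pred = [0.1, 0.9, 0.8, 0.0]   # stand-in for the proposed model's output

baseline_mse = mse(baseline_pred, target)
proposed_mse = mse(proposed_pred, target)
print(f"baseline MSE = {baseline_mse:.4f}, proposed MSE = {proposed_mse:.4f}")
print(f"relative reduction = {relative_reduction(baseline_mse, proposed_mse):.1f}%")
```

Under this definition, a reported reduction of 14.7% would mean the proposed model's MSE is 85.3% of the baseline's on that dataset.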

Li Weijun (李卫军); Zhang Xinyong (张新勇); Gao Yuxiao (高庾潇); Gu Jianlai (顾建来); Liu Jintong (刘锦彤)

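School of Computer Science and Engineering, North Minzu University, Yinchuan, Ningxia 750021; Key Laboratory of Images and Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan, Ningxia 750021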

Computer Science and Automation

video frame prediction; convolutional neural network; attention mechanism; gated convolution; encoder-decoder network

Journal of Zhengzhou University (Engineering Science), 2024 (001)

Research on Symbolic Model Checking Methods for Generalized Possibilistic Temporal Logic Based on Decision Processes

70-77,121 / 9

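Fundamental Research Funds for the Central Universities (2021JCYJ12); National Natural Science Foundation of China (61962001); Natural Science Foundation of Ningxia (2021AAC03215); Graduate Innovation Project of North Minzu University (YCX23147)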

10.13705/j.issn.1671-6833.2024.01.017
