Application Research of Computers (计算机应用研究), 2024, Vol. 41, Issue 4: 1252-1257, 6. DOI: 10.19734/j.issn.1001-3695.2023.07.0344

Video frame interpolation method based on improved visual Transformer

Shi Changtong (石昌通)¹, Shan Hongtao (单鸿涛)¹, Zheng Guangyuan (郑光远)², Zhang Yujin (张玉金)¹, Liu Huaiyuan (刘怀远)¹, Zong Zhihao (宗智浩)¹

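Author information

1. School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620
2. School of Information Technology, Shanghai Jian Qiao University, Shanghai 201306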

Abstract

To address the problem that existing video frame interpolation methods cannot effectively handle large-motion and complex-motion scenes, this paper proposes a video frame interpolation method based on an improved vision Transformer. The method fuses cross-scale window-based attention with separable spatio-temporal local attention, enlarging the receptive field of attention and aggregating multi-scale information. It jointly models spatio-temporal dependencies and long-range pixel dependencies, thereby enhancing the model's ability to handle large-motion scenes. Experimental results show that the model achieves PSNR values of 37.13 dB and 28.28 dB on the Vimeo90K test set and the DAVIS dataset, respectively, with SSIM values of 0.978 and 0.891, respectively. Visualization results further show that the proposed method produces clear and plausible interpolated frames for videos with large motion, complex motion, and occlusion.

Key words

video frame interpolation/Transformer/cross-scale window-based attention/large motion/complex motion

Classification

Information technology and security science

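Cite this article

Shi Changtong, Shan Hongtao, Zheng Guangyuan, Zhang Yujin, Liu Huaiyuan, Zong Zhihao. Video frame interpolation method based on improved visual Transformer [J]. Application Research of Computers, 2024, 41(4): 1252-1257, 6.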

Funding

National Natural Science Foundation of China (62173222)

Application Research of Computers (计算机应用研究)

Open access/Peking University Core Journal/CSTPCD

ISSN: 1001-3695
