Video Frame Interpolation Method Based on Improved Visual Transformer

Shi Changtong, Shan Hongtao, Zheng Guangyuan, Zhang Yujin, Liu Huaiyuan, Zong Zhihao

Application Research of Computers, 2024, Vol. 41, Issue 4: 1252-1257, 6. DOI: 10.19734/j.issn.1001-3695.2023.07.0344

Video frame interpolation method based on improved visual Transformer

Shi Changtong 1, Shan Hongtao 1, Zheng Guangyuan 2, Zhang Yujin 1, Liu Huaiyuan 1, Zong Zhihao 1

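Author information

  • 1. School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620
  • 2. School of Information Technology, Shanghai Jian Qiao University, Shanghai 201306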

Abstract

To address the inability of existing video frame interpolation methods to handle large-motion and complex-motion scenes effectively, this paper proposes a video frame interpolation method based on an improved vision Transformer. The method fuses cross-scale window-based attention with separable spatio-temporal local attention, which enlarges the receptive field of attention and aggregates multi-scale information. It jointly models spatio-temporal dependencies and long-range pixel dependencies, thereby enhancing the model's ability to handle large-motion scenes. Experimental results show that the model achieves PSNR values of 37.13 dB and 28.28 dB on the Vimeo90K test set and the DAVIS dataset, respectively, with SSIM values of 0.978 and 0.891, respectively. Visualization results further show that the proposed method produces clear and plausible interpolated frames for videos with large motion, complex motion, and occlusion.
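
For readers unfamiliar with the PSNR figures quoted in the abstract, the following is a minimal sketch of the standard PSNR definition, applied to two small made-up pixel sequences. It is not the authors' evaluation code: the function name `psnr`, the 8-bit peak value of 255, and the sample pixel values are all assumptions for illustration, and this page does not state how the paper computed the metric (for example, over which color channels or value range).

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames.

    Frames are flat sequences of pixel intensities. `max_value` is the
    largest possible intensity (255 assumes 8-bit images).
    """
    if len(reference) != len(estimate) or not reference:
        raise ValueError("frames must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames have unbounded PSNR
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical pixel values, not data from the paper.
ground_truth = [52, 55, 61, 59, 79, 61, 76, 61]
interpolated = [54, 55, 60, 59, 80, 61, 75, 63]
score = psnr(ground_truth, interpolated)
print(f"PSNR: {score:.2f} dB")
```

Higher values indicate that the interpolated frame is closer to the ground-truth frame; benchmark results such as those above are typically averages of per-frame scores over a test set.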

Key words

video frame interpolation / Transformer / cross-scale window-based attention / large motion / complex motion

Classification

Information Technology and Security Science

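Cite this article

Shi Changtong, Shan Hongtao, Zheng Guangyuan, Zhang Yujin, Liu Huaiyuan, Zong Zhihao. Video frame interpolation method based on improved visual Transformer [J]. Application Research of Computers, 2024, 41(4): 1252-1257, 6.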

Funding

National Natural Science Foundation of China (62173222)

Application Research of Computers

OA / Peking University Core Journal / CSTPCD

ISSN 1001-3695
