Unsupervised Video Object Segmentation Based on Discrete Cosine Transform Feature Fusion

Computer & Digital Engineering (计算机与数字工程), 2025, Vol. 53, Issue 2: 395-402 (8 pages). DOI: 10.3969/j.issn.1672-9722.2025.02.017

Unsupervised Video Object Segmentation Based on Discrete Cosine Transform Feature Fusion

Wang Yuchen (王玉琛)¹, Fan Jiaqing (樊佳庆)², Song Huihui (宋慧慧)¹

Author information

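1. School of Automation, Nanjing University of Information Science and Technology, Nanjing 210044
2. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106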

Abstract

Unsupervised video object segmentation (UVOS) aims to localize and segment foreground objects in videos without a manually provided ground-truth segmentation mask for the first frame. Existing methods mainly focus on improving segmentation accuracy while ignoring memory and computational cost. They generally enhance the fused appearance and motion features only in the spatial domain, according to their significance, and ignore the particular characteristics of features in the frequency domain. In addition, they do not make full use of global semantic information to guide video object segmentation. To address these problems, this paper proposes a lightweight UVOS network based on discrete cosine transform feature fusion. First, a lightweight backbone network extracts appearance and motion features simultaneously. Second, a discrete cosine transform feature fusion module is designed to fuse and enhance the appearance and motion features. Third, a global semantic guidance module built on large-kernel convolution is introduced, which reduces the computational complexity of large-kernel convolution while retaining its ability to extract global semantic information. Finally, under the guidance of global semantic information, the multi-level features enhanced in the frequency domain are aggregated progressively to obtain accurate segmentation results. With these designs, the proposed method has only 14.7 M parameters. Extensive experiments on the DAVIS2016, FBMS and DAVSOD datasets show that the method achieves favorable performance on J&F, MAE and Fm while maintaining high inference speed.
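The abstract does not describe the internals of the discrete cosine transform feature fusion module, so the following is only a minimal sketch of the general idea of weighting fused appearance and motion features by frequency-domain statistics. Everything here is an assumption made for illustration: the function names `dct_matrix`, `dct2` and `fuse`, the `keep` parameter, the element-wise sum of the two streams, and the sigmoid gate over low-frequency coefficients are hypothetical stand-ins, not the paper's design.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of lists (row k = frequency k)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def dct2(x):
    """2D DCT-II of an H x W map, computed as C_h @ x @ C_w^T."""
    h, w = len(x), len(x[0])
    ch, cw = dct_matrix(h), dct_matrix(w)
    # Transform along the height axis first.
    tmp = [[sum(ch[u][i] * x[i][j] for i in range(h)) for j in range(w)]
           for u in range(h)]
    # Then transform along the width axis.
    return [[sum(tmp[u][j] * cw[v][j] for j in range(w)) for v in range(w)]
            for u in range(h)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fuse(appearance, motion, keep=2):
    """Hypothetical fusion of two feature stacks (lists of C maps, each H x W).

    Assumed rule: sum the streams per channel, then scale each channel by a
    sigmoid of the mean absolute value of its keep x keep lowest-frequency
    DCT coefficients. Requires keep <= H and keep <= W.
    """
    fused = []
    for a, m in zip(appearance, motion):
        summed = [[av + mv for av, mv in zip(ar, mr)] for ar, mr in zip(a, m)]
        coeffs = dct2(summed)
        energy = sum(abs(coeffs[u][v])
                     for u in range(keep) for v in range(keep)) / (keep * keep)
        gate = sigmoid(energy)
        fused.append([[gate * v for v in row] for row in summed])
    return fused
```

In practice a module of this kind would operate on tensors inside a deep learning framework and learn its gating weights; the pure-Python version above only shows how a frequency-domain descriptor can modulate spatial features channel by channel.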

Keywords

unsupervised video object segmentation/discrete cosine transform/attention mechanism/frequency domain analysis

Classification

Information Technology and Security Science

Cite this article

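Wang Yuchen, Fan Jiaqing, Song Huihui. Unsupervised Video Object Segmentation Based on Discrete Cosine Transform Feature Fusion [J]. Computer & Digital Engineering, 2025, 53(2): 395-402, 8.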

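Funding

National Natural Science Foundation of China (Grant No. 61532009)

Natural Science Foundation of Jiangsu Province (Grant No. BK20191397)

Postgraduate Research & Practice Innovation Program of Jiangsu Province (Grant No. sjcx22_0355)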

Computer & Digital Engineering (计算机与数字工程)

ISSN: 1672-9722