Computer & Digital Engineering (计算机与数字工程), 2025, Vol. 53, Issue 2: 395-402, 8. DOI: 10.3969/j.issn.1672-9722.2025.02.017
Unsupervised Video Object Segmentation Based on Discrete Cosine Transform Feature Fusion
Abstract
Unsupervised video object segmentation (UVOS) aims to localize and segment foreground objects in videos without a manually provided ground-truth segmentation mask for the first frame. Existing methods mainly focus on improving segmentation accuracy while ignoring memory and computational cost. They generally enhance the fused appearance and motion features only in the spatial domain according to their significance, ignoring the particular characteristics of features in the frequency domain. In addition, they do not make full use of global semantic information to guide video object segmentation. To address these problems, this paper proposes a lightweight UVOS network based on discrete cosine transform (DCT) feature fusion. First, a lightweight backbone network extracts appearance and motion features simultaneously. Second, a DCT feature fusion module is designed to fuse and enhance the appearance and motion features. Then, a large-kernel convolution global semantic guidance module integrates large-kernel convolutions in a way that reduces the computational complexity of large-kernel convolution while retaining the ability to extract global semantic information. Finally, under the guidance of global semantic information, the multi-level features enhanced in the frequency domain are aggregated progressively to obtain accurate segmentation results. With these designs, the proposed method has only 14.7 M parameters. Extensive experiments on the DAVIS2016, FBMS, and DAVSOD datasets show that the method achieves favorable performance on J&F, MAE, and Fm while maintaining a high inference speed.
Keywords
unsupervised video object segmentation / discrete cosine transform / attention mechanism / frequency domain analysis
Classification
Information Technology and Security Science
Cite this article
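Wang Yuchen (王玉琛), Fan Jiaqing (樊佳庆), Song Huihui (宋慧慧). Unsupervised Video Object Segmentation Based on Discrete Cosine Transform Feature Fusion [J]. Computer & Digital Engineering (计算机与数字工程), 2025, 53(2): 395-402, 8.
Funding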
National Natural Science Foundation of China (Grant No. 61532009)
Natural Science Foundation of Jiangsu Province (Grant No. BK20191397)
Postgraduate Practice Innovation Program of Jiangsu Province (Grant No. sjcx22_0355)
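
Illustrative sketch
The abstract names a DCT feature fusion module but does not describe its internals. The following is only a rough sketch of how frequency-domain channel weighting could work in principle, not the authors' implementation. The function names (`dct_1d`, `dct_2d`, `fuse`), the `num_freq` parameter, the additive fusion, and the sigmoid gate are all assumptions made for illustration; a trained network would use learned layers in place of the fixed gate.

```python
import math


def dct_1d(signal):
    """Orthonormal DCT-II of a 1D sequence."""
    n = len(signal)
    out = []
    for k in range(n):
        total = sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i, v in enumerate(signal))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * total)
    return out


def dct_2d(fmap):
    """Separable 2D DCT-II of an H x W map: transform rows, then columns."""
    rows = [dct_1d(row) for row in fmap]
    cols = [dct_1d(col) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]  # transpose back to H x W


def fuse(appearance, motion, num_freq=2):
    """Hypothetical fusion of two feature tensors given as lists of H x W channels.

    Assumes H >= num_freq and W >= num_freq.
    """
    # Assumption: element-wise addition as the initial fusion step.
    summed = [[[a + m for a, m in zip(row_a, row_m)]
               for row_a, row_m in zip(ch_a, ch_m)]
              for ch_a, ch_m in zip(appearance, motion)]

    # Channel descriptor: mean magnitude of the lowest num_freq x num_freq
    # DCT coefficients of each fused channel.
    descriptors = []
    for channel in summed:
        coeffs = dct_2d(channel)
        low = [abs(coeffs[u][v]) for u in range(num_freq) for v in range(num_freq)]
        descriptors.append(sum(low) / len(low))

    # Fixed sigmoid gate relative to the mean descriptor (a stand-in for the
    # learned layers a real module would use).
    mean_desc = sum(descriptors) / len(descriptors)
    gates = [1.0 / (1.0 + math.exp(-(d - mean_desc))) for d in descriptors]

    fused = [[[g * x for x in row] for row in channel]
             for channel, g in zip(summed, gates)]
    return fused, gates
```

Calling `fuse(appearance, motion)` returns the reweighted channels together with the per-channel gates, so channels with more low-frequency energy than average receive a gate above 0.5. Because the transform here is orthonormal, the total energy of each map is unchanged by `dct_2d`, which makes the coefficients directly comparable across channels.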