RGB-thermal Fused Video Object Segmentation Method Based on Prototype Enhancement and Temporal Error Learning

Aerospace Shanghai (Chinese & English), 2026, Vol. 43, Issue 2: 120-128, 9. DOI: 10.19328/j.cnki.2096-8655.2026.02.012

WANG Hanzeng (王瀚增)¹, ZHANG Ruiheng (张睿恒)¹, XU Lixin (徐立新)¹, XU Xiaofeng (徐晓峰)², LIU Yumeng (刘雨蒙)³

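Author information

1. School of Mechatronical Engineering, Beijing Institute of Technology, Beijing 100081
2. School of Computer and Information, Anhui Polytechnic University, Wuhu 241000, Anhui
3. Institute of Software, Chinese Academy of Sciences, Beijing 100190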

Abstract

RGB-thermal fused video object segmentation (RGB-T VOS) can effectively address segmentation challenges in complex environments by combining the fine-grained texture information of visible images with the target saliency cues provided by thermal imagery. However, most existing RGB-T VOS fusion paradigms adopt an open-loop strategy, in which the fusion decisions for the current frame rely solely on instantaneous feature inputs, ignoring temporal continuity and lacking the ability to model how the contribution of each modality varies over time. To address this limitation, this paper proposes a closed-loop RGB-T fusion framework based on prototype enhancement and temporal error learning, built upon an explicit attention-enhanced fusion architecture. Specifically, a temporal prototype enhancement mechanism leverages target feature statistics from the previous frame to strengthen the feature representation of target regions in the current frame. In addition, a learnable error encoder compensates for prediction deviations arising during fusion, enabling adaptive correction of channel-wise gating and spatial attention decisions. Experimental results show that the proposed method achieves average Jaccard index (J) and F-score (F) values of 87.7 and 88.6 on the VisT300 and VT-UAV datasets, respectively, outperforming existing methods. These results validate the effectiveness of the proposed closed-loop fusion strategy in improving temporal stability and segmentation accuracy.
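The abstract does not spell out how the temporal prototype is computed or applied. As a rough illustration only, the sketch below uses one common formulation: masked average pooling of previous-frame features to form a prototype, followed by cosine-similarity reweighting of current-frame features. The function names, the flattened pixel-list layout, and the `strength` parameter are all assumptions made for this example and are not taken from the paper.

```python
import math


def masked_average_prototype(features, mask):
    """Mask-weighted mean of per-pixel feature vectors (assumed formulation).

    features: list of N pixel feature vectors, each a list of C floats
    mask:     list of N weights in [0, 1], e.g. the previous-frame target mask
    """
    total = sum(mask)
    if total == 0:
        raise ValueError("mask selects no target pixels")
    channels = len(features[0])
    return [
        sum(m * f[c] for f, m in zip(features, mask)) / total
        for c in range(channels)
    ]


def cosine_similarity(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def enhance_with_prototype(features, prototype, strength=1.0):
    """Scale each pixel feature by 1 + strength * max(similarity, 0).

    Pixels that resemble the prototype are amplified; dissimilar pixels
    are left (almost) unchanged. `strength` is a hypothetical knob.
    """
    enhanced = []
    for f in features:
        weight = 1.0 + strength * max(cosine_similarity(f, prototype), 0.0)
        enhanced.append([weight * x for x in f])
    return enhanced


# Toy data: 4 previous-frame pixels with 2 channels; the first two are target.
prev_features = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9]]
prev_mask = [1.0, 1.0, 0.0, 0.0]
prototype = masked_average_prototype(prev_features, prev_mask)

# Current frame: one target-like pixel and one background-like pixel.
curr_features = [[0.9, 0.1], [0.0, 1.0]]
enhanced = enhance_with_prototype(curr_features, prototype)
```

In this toy setup the target-like pixel is amplified far more than the background-like one, which is the qualitative behaviour the abstract attributes to prototype enhancement. A real model would operate on learned feature maps and would likely learn the reweighting rather than fix it by hand.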

Keywords

video object segmentation; RGB-thermal fusion; error feedback; deep learning; segment anything model 2 (SAM2)

Classification

Information Technology and Security Science

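Cite this article

WANG Hanzeng, ZHANG Ruiheng, XU Lixin, XU Xiaofeng, LIU Yumeng. RGB-thermal fused video object segmentation method based on prototype enhancement and temporal error learning[J]. Aerospace Shanghai (Chinese & English), 2026, 43(2): 120-128, 9.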

Funding

National Natural Science Foundation of China (62475016, 62402481, 62406004)

Beijing Natural Science Foundation (L252142)

Aerospace Shanghai (Chinese & English)

ISSN: 2096-8655
