基于原型增强与时序误差学习的红外可见光融合视频目标分割方法

上海航天(中英文), 2026, Vol. 43, Issue 2: 120-128 (9 pages). DOI: 10.19328/j.cnki.2096-8655.2026.02.012

RGB-thermal Fused Video Object Segmentation Method Based on Prototype Enhancement and Temporal Error Learning

王瀚增¹, 张睿恒¹, 徐立新¹, 徐晓峰², 刘雨蒙³

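Author information

  • 1. School of Mechatronical Engineering, Beijing Institute of Technology, Beijing 100081
  • 2. School of Computer and Information, Anhui Polytechnic University, Wuhu 241000, Anhui
  • 3. Institute of Software, Chinese Academy of Sciences, Beijing 100190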

Abstract

RGB-thermal fused video object segmentation (RGB-T VOS) can effectively address segmentation challenges in complex environments by combining the fine-grained texture information of visible images with the target saliency cues provided by thermal imagery. However, most existing RGB-T VOS fusion paradigms adopt an open-loop strategy, in which the fusion decisions for the current frame rely solely on the instantaneous feature inputs, ignoring temporal continuity and failing to model how the contribution of each modality varies over time. To address this limitation, this paper proposes a closed-loop RGB-T fusion framework based on prototype enhancement and temporal error learning, built upon an explicit attention-enhanced fusion architecture. Specifically, a temporal prototype enhancement mechanism leverages the target feature statistics from the previous frame to strengthen the feature representation of the target regions in the current frame. In addition, a learnable error encoder compensates for the prediction deviations arising during the fusion process, enabling adaptive correction of the channel-wise gating and spatial attention decisions. Experimental results show that the proposed method achieves average Jaccard index (J) and F-score (F) values of 87.7 and 88.6 on the VisT300 and VT-UAV datasets, respectively, outperforming existing methods. These results validate the effectiveness of the proposed closed-loop fusion strategy in improving temporal stability and segmentation accuracy.
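
For readers unfamiliar with the region metric named above, the Jaccard index (J) of a predicted mask against a ground-truth mask is the ratio of their intersection to their union; reported scores such as 87.7 are presumably this ratio expressed as a percentage and averaged over frames. The following is a minimal sketch of that standard definition only, not the authors' evaluation code. The function names `jaccard_index` and `mean_jaccard`, the list-of-rows mask layout, and the convention for two empty masks are assumptions made for illustration.

```python
from typing import Sequence

# Assumed layout: a mask is a list of rows, each row a list of 0/1 pixel labels.
Mask = Sequence[Sequence[int]]


def jaccard_index(pred: Mask, gt: Mask) -> float:
    """Region similarity J = |pred AND gt| / |pred OR gt| for two binary masks."""
    if len(pred) != len(gt) or any(len(p) != len(g) for p, g in zip(pred, gt)):
        raise ValueError("masks must have the same shape")
    intersection = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                intersection += 1
            if p or g:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


def mean_jaccard(preds: Sequence[Mask], gts: Sequence[Mask]) -> float:
    """Average J over the frames of one video sequence."""
    if len(preds) != len(gts) or len(preds) == 0:
        raise ValueError("need the same non-zero number of predicted and ground-truth frames")
    return sum(jaccard_index(p, g) for p, g in zip(preds, gts)) / len(preds)


if __name__ == "__main__":
    pred = [[1, 1, 0],
            [0, 1, 0]]
    gt = [[1, 0, 0],
          [0, 1, 1]]
    # 2 pixels are foreground in both masks, 4 in at least one of them.
    print(jaccard_index(pred, gt))  # 0.5
```

The F-score reported alongside J is a separate measure and is not covered by this sketch.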

Key words

video object segmentation / RGB-thermal fusion / error feedback / deep learning / segment anything model 2 (SAM2)

Classification

Information Technology and Safety Science

Cite this article

王瀚增, 张睿恒, 徐立新, 徐晓峰, 刘雨蒙. 基于原型增强与时序误差学习的红外可见光融合视频目标分割方法[J]. 上海航天(中英文), 2026, 43(2): 120-128, 9.

Funding

National Natural Science Foundation of China (62475016, 62402481, 62406004)

Beijing Natural Science Foundation (L252142)

上海航天(中英文)

ISSN 2096-8655
