Computer Engineering (计算机工程), 2025, Vol. 51, Issue 2: 126-138, 13. DOI: 10.19678/j.issn.1000-3428.0069013
掩模特征融合:实例分割新范式
Mask Feature Fusion: New Paradigm of Instance Segmentation
Abstract
Instance segmentation is a fundamental task in visual scene understanding. By analyzing the similarities and differences among existing algorithms, this paper proposes a novel instance segmentation paradigm called Mask Feature Fusion (MFF). This paradigm divides the instance segmentation task into three modules: extraction of semantically independent mask features, extraction of semantically related sequences, and fusion of sequence features with mask features. Building on the structural characteristics of MFF, two optimizations are proposed. First, a non-local global bias is designed to strengthen the backbone network's focus on global information. This allows the mask feature extraction module to access global information at shallow network levels and mitigates the inherent dataset biases introduced by pretrained weights. Second, during experiments, the query vectors of some Transformer models are observed to be unstable in the early training stages: the Regions of Interest (ROIs) of most query vectors shift after each cross-attention operation. To address this issue, a denoising training method is introduced for the sequence extraction module. This method keeps the attention of the query vectors focused on the same area in the early stages of training, thereby accelerating the convergence of the Transformer decoder and improving model precision under identical parameter configurations. Experimental results demonstrate the effectiveness of these improvements. Specifically, on the instance segmentation task of the MS-COCO 2017 dataset, adding these improvements to the baseline model of the MFF paradigm increases the mask mean Average Precision (mAP) by 5.0%.
Keywords (Chinese)
实例分割范式 / 掩模特征融合 / 非局部全局偏置 / 去噪训练 / 查询向量漂移
Key words
instance segmentation paradigm / Mask Feature Fusion (MFF) / non-local global bias / denoising training / query vector shifting
Classification
Computer and Automation
Cite this article
李伟康, 张思全. 掩模特征融合:实例分割新范式[J]. 计算机工程, 2025, 51(2): 126-138, 13.
Funding
National Natural Science Foundation of China (51175321).
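
Illustrative sketch of the fusion module named in the abstract. The abstract describes the third MFF module only as "fusion of sequence features with mask features" and does not give the operator. The sketch below assumes the per-pixel dot product followed by a sigmoid that is common in query-based segmentation models; that choice, the function names, and the tensor layout are assumptions for illustration, not the authors' implementation.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def fuse_queries_with_mask_features(queries, mask_features):
    """Fuse sequence features with mask features (hypothetical operator).

    queries:       N query vectors, each a list of C floats
                   (stand-in for the output of the sequence extraction module).
    mask_features: C x H x W nested lists
                   (stand-in for the output of the mask feature module).
    Returns N x H x W nested lists of per-pixel mask probabilities.
    """
    channels = len(mask_features)
    height = len(mask_features[0])
    width = len(mask_features[0][0])
    masks = []
    for q in queries:
        if len(q) != channels:
            raise ValueError("query length must equal the number of feature channels")
        # Assumed fusion: dot product over the channel axis at every pixel.
        mask = [
            [
                sigmoid(sum(q[c] * mask_features[c][y][x] for c in range(channels)))
                for x in range(width)
            ]
            for y in range(height)
        ]
        masks.append(mask)
    return masks


# Toy input: 2 channels, a 1 x 2 feature map, and 2 queries.
toy_features = [[[1.0, 0.0]], [[0.0, 1.0]]]
toy_queries = [[4.0, -4.0], [-4.0, 4.0]]
toy_masks = fuse_queries_with_mask_features(toy_queries, toy_features)
# Threshold at 0.5 to obtain one binary instance mask per query.
toy_binary = [[[p > 0.5 for p in row] for row in mask] for mask in toy_masks]
```

In this toy input each query responds to a different channel, so the two queries yield complementary binary masks. This mirrors the separation the abstract describes: the mask features carry no instance identity on their own, and each query selects the pixels belonging to its instance.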