MLDAC: Multi-Task Dense Attention Computation Self-Supervised Few-Shot Semantic Segmentation Method
To address the problem that existing few-shot semantic segmentation methods still require a large number of pixel-level annotations to train a model, a multi-task dense attention computation self-supervised few-shot semantic segmentation method (MLDAC) is proposed. The method splits the unsupervised saliency region of a single image in the dataset into two parts: one part serves as the support mask for few-shot segmentation, while cross-entropy losses are computed between the prediction and, respectively, the other part and the whole saliency region, forming multiple tasks for multi-task learning and improving the model's generalization ability. A Swin Transformer backbone extracts multi-scale features, which are fed into multi-scale dense attention blocks to make full use of pixel-level correlations across scales; the final prediction is obtained through inter-scale mixing and skip connections. Experimental results show that the self-supervised MLDAC method achieves 55.1% and 26.8% 1-shot mIoU on the PASCAL-5i and COCO-20i datasets respectively, improvements of 1.3 and 2.2 percentage points over the current best self-supervised few-shot semantic segmentation method. It also reaches 78.1% 1-shot mIoU on the FSS-1000 dataset, confirming the effectiveness of MLDAC.
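A minimal sketch of the self-supervised episode construction described in the abstract: the saliency region of a single image is split into two parts, one part acts as the support mask, and the prediction is penalized against both the complementary part and the whole region as separate tasks. The split rule (a vertical cut at the median salient column), the function names, and the weight alpha are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def split_saliency_mask(saliency: torch.Tensor):
    """Split a binary {0,1} float saliency mask of shape (H, W) into two
    disjoint parts. The vertical cut at the median salient column is an
    illustrative choice; the paper only states the region is divided in two.
    Assumes a non-empty saliency region.
    """
    cols = torch.nonzero(saliency, as_tuple=True)[1]  # x-coords of salient pixels
    cut = int(cols.float().median().item())
    part_a = saliency.clone()
    part_a[:, cut:] = 0               # one half, e.g. used as the support mask
    part_b = saliency - part_a        # the complementary half, used as a target
    return part_a, part_b

def multi_task_loss(pred_logits, other_part, full_mask, alpha=0.5):
    """Cross-entropy against the complementary part and against the whole
    saliency region, combined as two tasks; `alpha` is a hypothetical weight.

    pred_logits: (N, 2, H, W) binary-segmentation logits
    other_part, full_mask: (N, H, W) {0,1} targets
    """
    loss_part = F.cross_entropy(pred_logits, other_part.long())
    loss_full = F.cross_entropy(pred_logits, full_mask.long())
    return alpha * loss_part + (1.0 - alpha) * loss_full
```

In such a scheme the same image supplies both support and query views, so no human annotation is needed; which half plays the support role could also be swapped to form additional training episodes.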
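The abstract also describes dense attention blocks that exploit pixel-level correlations between support and query features at multiple Swin Transformer scales. The single-scale sketch below, assuming cosine-normalized dot-product attention restricted to salient support pixels, illustrates one plausible form of such a block; it is not the paper's exact design.

```python
import torch
import torch.nn.functional as F

def dense_attention(query_feat, support_feat, support_mask):
    """Pixel-level correlation between query and masked support features.

    query_feat:   (N, C, Hq, Wq) query-image features at one Swin scale
    support_feat: (N, C, Hs, Ws) support-image features at the same scale
    support_mask: (N, 1, Hs, Ws) support mask resized to feature resolution
    Returns an attention read-out of support features for every query pixel,
    shape (N, C, Hq, Wq); assumes the mask is non-empty.
    """
    n, c, hq, wq = query_feat.shape
    q = F.normalize(query_feat.flatten(2), dim=1).transpose(1, 2)  # (N, HqWq, C)
    k = F.normalize(support_feat.flatten(2), dim=1)                # (N, C, HsWs)
    attn = torch.bmm(q, k)                      # dense affinity (N, HqWq, HsWs)
    # Restrict attention to salient support pixels before normalizing.
    attn = attn.masked_fill(support_mask.flatten(2) < 0.5, float('-inf'))
    attn = attn.softmax(dim=-1)
    v = support_feat.flatten(2).transpose(1, 2)                    # (N, HsWs, C)
    return torch.bmm(attn, v).transpose(1, 2).reshape(n, c, hq, wq)
```

Per the abstract, this computation would run at several backbone scales, with the per-scale outputs combined by inter-scale mixing and skip connections to produce the final prediction.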
WANG Weihang; ZHANG Yi
College of Computer Science, Sichuan University, Chengdu 610065, China
Computer Science and Automation
multi-task learning; few-shot semantic segmentation; Swin Transformer; self-supervised learning
Computer Engineering and Applications (《计算机工程与应用》), 2025, Issue 4
Pages 211-221 (11 pages)
Supported by the National Natural Science Foundation of China (U20A20161).