Transactions of the Chinese Society for Agricultural Machinery (农业机械学报), 2026, Vol. 57, Issue 1: 72-82 (11 pages). DOI: 10.6041/j.issn.1000-1298.2026.01.007
基于自监督学习的玉米植株图像小样本语义分割模型
Self-supervised Few-shot Semantic Segmentation Model for Maize Plant Images
Abstract
Image semantic segmentation is one of the key methods for obtaining phenotypic information of maize plants. Traditional fully supervised semantic segmentation methods typically rely on a large number of pixel-level labels. However, maize exhibits significant morphological variability across growth stages, which makes image annotation costly and limits the practical application of such models in real-world production scenarios. To eliminate the need for manual annotation during model training, a self-supervised few-shot semantic segmentation network for maize plant images (MSDANet) was proposed, aiming to improve segmentation accuracy and generalization capability for maize plant images across different growth stages. MSDANet used a superpixel-based self-supervised learning method to generate pseudo labels, which provided preliminary supervision signals for the support set images without manual annotation. A mixed masking mechanism (MM) was designed that applied pseudo-label-based semantic masking to construct diverse masked samples in the feature space, encouraging the model to learn more robust feature representations and thereby improving segmentation accuracy against complex backgrounds. To handle the complex morphology of maize plants in images, such as bending, overlapping, and occlusion, a multi-scale deformable large kernel attention mechanism (MS-DLKA) was designed. By integrating multi-scale receptive fields with deformable convolutions, it flexibly perceived important structural information of maize plants at different scales, improving semantic segmentation accuracy. When validated on a small-sample dataset, MSDANet achieved mIoU and FB-IoU of 75.63% and 87.12%, respectively, in the 1-shot setting; in the 5-shot setting, mIoU and FB-IoU reached 76.04% and 87.21%, respectively, both outperforming the other models of the same type compared in this study. Additionally, compared with current mainstream fully supervised few-shot semantic segmentation models, mIoU improved by 2.9 and 2.93 percentage points under the 1-shot and 5-shot settings, respectively. The results demonstrated that MSDANet can achieve high-precision semantic segmentation of maize plant images with few samples and without human labels, providing technical support for maize image analysis and plant phenotyping at different growth stages.
Key words
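maize images/plant phenotype/image processing/deep learning/semantic segmentation/self-supervised learning
Classification
Agricultural science and technology
Cite this article
DENG Hanbing, LIU Xin, LI Chaoyang, MIAO Teng. Self-supervised Few-shot Semantic Segmentation Model for Maize Plant Images [J]. Transactions of the Chinese Society for Agricultural Machinery, 2026, 57(1): 72-82, 11.
Funding
National Key Research and Development Program of China (2022YFD2002303-01) and General Project of the Basic Scientific Research Program of the Education Department of Liaoning Province (JYTM20231303)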
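
The abstract reports results as mIoU and FB-IoU but does not spell out how they are computed. The sketch below shows one common few-shot segmentation convention, offered as an assumption rather than the authors' evaluation code: masks are flat lists of 0/1 values, FB-IoU averages the foreground and background IoU of a prediction, and mIoU averages the foreground IoU accumulated per class. The function names and the class names used in the example (`seedling`, `jointing`) are hypothetical.

```python
from collections import defaultdict


def iou(pred, target, label):
    """IoU of the pixels equal to `label` in two flat 0/1 masks."""
    inter = sum(1 for p, t in zip(pred, target) if p == label and t == label)
    union = sum(1 for p, t in zip(pred, target) if p == label or t == label)
    # Assumption: if neither mask contains `label`, count it as a perfect match.
    return inter / union if union else 1.0


def fb_iou(pred, target):
    """Mean of foreground (1) and background (0) IoU for one prediction."""
    return 0.5 * (iou(pred, target, 1) + iou(pred, target, 0))


def mean_iou(episodes):
    """Mean over classes of foreground IoU.

    `episodes` is an iterable of (class_name, pred, target) tuples.
    Intersections and unions are accumulated per class before dividing.
    """
    inter, union = defaultdict(int), defaultdict(int)
    for cls, pred, target in episodes:
        inter[cls] += sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
        union[cls] += sum(1 for p, t in zip(pred, target) if p == 1 or t == 1)
    per_class = [inter[c] / union[c] for c in union if union[c] > 0]
    return sum(per_class) / len(per_class) if per_class else 0.0


if __name__ == "__main__":
    # Toy 4-pixel masks; the class names are hypothetical growth stages.
    episodes = [
        ("seedling", [1, 1, 0, 0], [1, 0, 0, 0]),
        ("jointing", [1, 1, 0, 0], [1, 1, 0, 0]),
    ]
    print("mIoU:", mean_iou(episodes))
    print("FB-IoU (first episode):", fb_iou(*episodes[0][1:]))
```

Under this convention FB-IoU is usually higher than mIoU because large, easy background regions contribute half of the score, which is consistent with the gap between the two figures reported above.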