Smart Agriculture, 2024, Vol. 6, Issue 3: 128-137, 10. DOI: 10.12133/j.smartag.SA202401020
Identification Method of Kale Leaf Ball Based on Improved UperNet
Abstract
[Objective] Kale is an important bulk vegetable crop worldwide, and its main growth characteristics are its outer leaves and leaf bulbs. The traits of the kale leaf bulb are crucial for adjusting water and fertilizer parameters in the field to achieve maximum yield. However, various factors such as soil quality, light exposure, leaf overlap, and shading can affect the growth of kale in practical field conditions. The similarity in color and texture between leaf bulbs and outer leaves complicates segmentation for existing recognition models. This paper addresses the segmentation of kale outer leaves and leaf bulbs against complex field backgrounds, using pixel values to determine leaf bulb size for intelligent field management. A semantic segmentation algorithm, UperNet-ESA, is proposed that exploits the morphological features of kale leaf bulbs and outer leaves to segment them efficiently and accurately in field scenes.
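The idea of deriving leaf bulb size from a segmentation result can be illustrated with a minimal sketch. It assumes that size is estimated by counting the pixels labelled as leaf bulb; the class IDs, the `bulb_size` helper, and the `mm_per_pixel` calibration factor are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical class IDs; the paper does not state its label encoding.
BACKGROUND, OUTER_LEAF, LEAF_BULB = 0, 1, 2


def bulb_size(mask, mm_per_pixel=1.0):
    """Return (pixel_count, area_mm2, equivalent_diameter_mm) for the leaf bulb class.

    mask is a 2D list of class IDs, one per pixel.
    mm_per_pixel is an assumed camera calibration (side length of one pixel).
    """
    pixels = sum(row.count(LEAF_BULB) for row in mask)
    area = pixels * mm_per_pixel ** 2
    # Diameter of the circle that has the same area as the bulb region.
    diameter = 2.0 * math.sqrt(area / math.pi)
    return pixels, area, diameter


# Toy 4x5 mask: a bulb region surrounded by outer leaves and background.
mask = [
    [0, 1, 1, 1, 0],
    [1, 2, 2, 2, 1],
    [1, 2, 2, 2, 1],
    [0, 1, 1, 1, 0],
]
pixels, area, diameter = bulb_size(mask, mm_per_pixel=2.0)
print(pixels, area, round(diameter, 2))
```

The equivalent diameter is only one possible size measure; a bounding-box width or a fitted ellipse would work with the same mask.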
[Methods] The UperNet-ESA semantic segmentation algorithm builds on the unified perceptual parsing network (UperNet), an efficient semantic segmentation framework that integrates semantic information across different scales and is therefore well suited to extracting crop features in complex environments. The backbone network, which is responsible for feature extraction in the model, was improved using ConvNeXt. The similarity between kale leaf bulbs and outer leaves, together with leaf overlap that hinders accurate localization of target contours, posed challenges for the baseline network and led to low accuracy. ConvNeXt combines the strengths of convolutional neural networks (CNN) and Transformers, applying design principles from Swin Transformer on top of ResNet50 to create a highly effective network structure. The simplicity of the ConvNeXt design enhances segmentation accuracy with minimal model complexity and positions it as a top performer among CNN architectures. In this study, the ConvNeXt-B version was chosen based on computational complexity and the background characteristics of the kale image dataset. To enhance the model's perceptual acuity, the block ratios of the four stages were set to 3:3:27:3, with corresponding channel numbers of 128, 256, 512, and 1024. Given the visual similarity between kale leaf bulbs and outer leaves, a high-efficiency channel attention (ECA) mechanism was integrated into the backbone network to improve feature extraction in the leaf bulb region. By incorporating attention weights into the feature maps through residual inversion, attention parameters were trained cyclically within each block, yielding feature maps that carry attention weights. This iterative process allowed the attention parameters to be trained repeatedly and improved the capture of global feature information. Direct pixel-wise addition of up-sampled and local features can misalign context in the feature maps and cause misclassification at kale leaf boundaries. To address this, a feature alignment module and a feature selection module were introduced into the feature pyramid network to refine the extraction of target boundary information and improve segmentation accuracy.
[Results and Discussions] The UperNet-ESA semantic segmentation model outperformed the current mainstream UNet, PSPNet, and DeepLabV3+ models in segmentation accuracy: mIoU and mPA reached 92.45% and 94.32%, respectively, with an inference speed of up to 16.6 frames per second (fps). Its mPA exceeded that of the UNet model, the PSPNet model, and the DeepLabV3+ models with ResNet-50, MobileNetV2, and Xception backbones by 11.52%, 13.56%, 8.68%, 4.31%, and 6.21%, respectively. Similarly, its mIoU improved by 12.21%, 13.04%, 10.65%, 3.26%, and 7.11% over the same models, in the same order. These gains can be attributed to the ECA module and the improved feature pyramid network, which strengthen the judgement of target features at each stage and yield effective global contextual information. Although the PSPNet model had the fastest inference speed, its overall accuracy was too low for developing kale semantic segmentation models. The proposed model, by contrast, ran faster than the remaining network models.
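The abstract reports mIoU and mPA without defining them. The sketch below uses the usual definitions (per-class intersection over union and per-class pixel accuracy, each averaged over classes) computed from a confusion matrix. It is an illustration on made-up labels, not the authors' evaluation code, and the function names are hypothetical.

```python
def confusion_matrix(truth, pred, num_classes):
    """cm[t][p] counts pixels whose true class is t and predicted class is p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        cm[t][p] += 1
    return cm


def mean_pa_and_iou(cm):
    """Return (mPA, mIoU), skipping classes that never occur."""
    n = len(cm)
    pas, ious = [], []
    for c in range(n):
        tp = cm[c][c]
        gt = sum(cm[c])                              # pixels truly of class c
        predicted = sum(cm[r][c] for r in range(n))  # pixels predicted as class c
        union = gt + predicted - tp
        if gt:
            pas.append(tp / gt)
        if union:
            ious.append(tp / union)
    return sum(pas) / len(pas), sum(ious) / len(ious)


# Flattened toy labels: 0 = background, 1 = outer leaf, 2 = leaf bulb (assumed IDs).
truth = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
pred = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
cm = confusion_matrix(truth, pred, num_classes=3)
mpa, miou = mean_pa_and_iou(cm)
print(f"mPA={mpa:.4f}, mIoU={miou:.4f}")
```

Because IoU penalizes false positives as well as false negatives, mIoU is never higher than mPA on the same predictions, which matches the ordering of the two figures reported above.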
[Conclusions] The experimental results showed that the UperNet-ESA semantic segmentation model proposed in this study outperforms the original network. The improved model achieves the best accuracy-speed balance among the current mainstream semantic segmentation networks. In future research, the model will be further optimized and the kale dataset will be expanded to include a wider range of kale leaf bulb samples, with the aim of providing a more robust and comprehensive theoretical foundation for intelligent kale field management.
Key words
kale/semantic segmentation/leaf ball identification/attention mechanism/feature selection/feature alignment
Classification
Information Technology and Security Science
Cite this article
ZHU Yiping, WU Huarui, GUO Wang, WU Xiaoyan. Identification Method of Kale Leaf Ball Based on Improved UperNet [J]. Smart Agriculture, 2024, 6(3): 128-137, 10.
Funding
National Key Research and Development Programme of the 14th Five-Year Plan (2022YFD1600602)
Ministry of Finance and Ministry of Agriculture and Rural Affairs: Funding for the National Modern Agricultural Industry Technology System (CARS-23-D07)