D-PAG: Cross-modal Wolfberry Pest Recognition Model Based on Parameter-Efficient Fine-Tuning

(基于参数高效微调的跨模态枸杞虫害识别模型D-PAG)

邢嘉璐 (Xing Jialu) 1, 刘建平 (Liu Jianping) 2, 周国民 (Zhou Guomin) 3, 刘立波 (Liu Libo) 4, 王健 (Wang Jian) 5

Journal of Agricultural Big Data (农业大数据学报), 2024, Vol. 6, Issue 4: 509-521 (13 pages). DOI: 10.19788/j.issn.2096-6369.000067

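Author information

  • 1. School of Computer Science and Engineering, North Minzu University, Yinchuan 750021
  • 2. School of Computer Science and Engineering, North Minzu University, Yinchuan 750021; Key Laboratory of Images and Graphics Intelligent Processing of the State Ethnic Affairs Commission, Yinchuan 750021
  • 3. Nanjing Institute of Agricultural Mechanization, Ministry of Agriculture and Rural Affairs, Nanjing 210014; National Agriculture Science Data Center, Agricultural Information Institute, Chinese Academy of Agricultural Sciences, Beijing 100081
  • 4. School of Information Engineering, Ningxia University, Yinchuan 750021
  • 5. Agricultural Information Institute, Chinese Academy of Agricultural Sciences, Beijing 100081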

Abstract

As multimodal foundation models (large models) mature, transferring them efficiently to specific domains and tasks has become an active research topic. This study takes the multimodal large model CLIP as the base model and adapts it to wolfberry (goji berry) pest identification using parameter-efficient fine-tuning methods such as Prompt and Adapter. The resulting cross-modal parameter-efficient fine-tuning model for wolfberry pest recognition is named D-PAG. First, learnable Prompts and Adapters are embedded in the input or hidden layers of the CLIP encoder to capture pest features. Gated units then integrate the Prompt and Adapter, further balancing the learning capacity. Within the Adapter, a GCS-Adapter is designed to enhance the attention mechanism for cross-modal semantic information fusion. To validate the method, experiments were conducted on a wolfberry pest dataset and on the fine-grained dataset IP102. With only 20% of the samples, the model reached an accuracy of 98.8% on the wolfberry dataset, and 99.5% with 40% of the samples. On IP102 it attained an accuracy of 75.6%, comparable to ViT. The approach transfers the foundational knowledge of multimodal large models to the specific domain of pest recognition with minimal additional parameters, providing a new technical solution for agricultural image processing problems.
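
The abstract does not give the equations behind the Adapter or the gated unit, so the following is only a minimal sketch of the general idea, not the D-PAG implementation. It assumes a standard residual bottleneck adapter and a single scalar sigmoid gate that blends a prompt-branch feature with an adapter-branch feature. All names (`adapter`, `gated_mix`, `gate_logit`) and the random vectors standing in for frozen CLIP features are hypothetical.

```python
import math
import random


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def adapter(x, W_down, W_up):
    """Residual bottleneck adapter: x + W_up * relu(W_down * x)."""
    h = [max(0.0, a) for a in matvec(W_down, x)]
    return [xi + ui for xi, ui in zip(x, matvec(W_up, h))]


def gated_mix(prompt_feat, adapter_feat, gate_logit):
    """Blend two feature vectors with one learnable scalar gate (an assumption)."""
    g = sigmoid(gate_logit)
    return [g * p + (1.0 - g) * a for p, a in zip(prompt_feat, adapter_feat)]


rng = random.Random(0)
dim, bottleneck = 8, 2

# Stand-ins for features from a frozen encoder; no real CLIP model is loaded here.
x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
prompt_feat = [rng.gauss(0.0, 1.0) for _ in range(dim)]

# Only these small matrices and the gate would be trained; the backbone stays frozen.
W_down = [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(bottleneck)]
W_up = [[0.0] * bottleneck for _ in range(dim)]  # zero init: adapter starts as identity

adapter_feat = adapter(x, W_down, W_up)
fused = gated_mix(prompt_feat, adapter_feat, gate_logit=0.0)  # gate = 0.5 at start

trainable_params = 2 * dim * bottleneck + 1  # two projections plus the gate scalar
```

In this toy setup the trainable parameter count is small relative to the backbone, which is the point of parameter-efficient fine-tuning. How D-PAG actually parameterizes its gate and its GCS-Adapter is described in the full paper, not in this abstract.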

Key words

wolfberry / pest identification / parameter-efficient fine-tuning / large model / CLIP

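Cite this article

邢嘉璐, 刘建平, 周国民, 刘立波, 王健. D-PAG: Cross-modal Wolfberry Pest Recognition Model Based on Parameter-Efficient Fine-Tuning [J]. Journal of Agricultural Big Data, 2024, 6(4): 509-521.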

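Funding

National Natural Science Foundation of China (32460444); Key Scientific Research Project of North Minzu University (2023ZRLG12); Graduate Innovation Project of North Minzu University (YCX24126).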

Journal of Agricultural Big Data (农业大数据学报), ISSN 2096-6369. OA; CSTPCD.
