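Intelligent Question-Answering System for Fruit and Vegetable Agricultural Technology Knowledge Based on Artificial Intelligence Large Model Technology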

王婷 王娜 崔运鹏 刘娟

Smart Agriculture (智慧农业(中英文)), 2023, Vol. 5, Issue 4: 105-116 (12 pages). DOI: 10.12133/j.smartag.SA202311005

Agricultural Technology Knowledge Intelligent Question-Answering System Based on Large Language Model

王婷 1, 王娜 2, 崔运鹏 1, 刘娟 1

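Author information

  • 1. Agricultural Information Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China; Key Laboratory of Agricultural Big Data, Ministry of Agriculture and Rural Affairs, Beijing 100081, China
  • 2. Unit 96962, Beijing 102206, China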

Abstract

[Objective] The rural revitalization strategy places new demands on agricultural technology extension. Conventional extension methods, however, suffer from a mismatch between supply and demand, so the way agricultural knowledge is supplied needs further innovation. Recent advances in artificial intelligence, such as deep learning and large-scale neural networks, and in particular the advent of large language models (LLMs), make anthropomorphic and intelligent agricultural technology extension feasible. Oriented toward the demand for fruit and vegetable agricultural technology knowledge services, this research built an LLM-based intelligent question-answering system that provides agricultural technology extension services, including guidance on new agricultural knowledge and question-and-answer sessions, so that farmers can access high-quality agricultural knowledge at their convenience.
[Methods] Through an analysis of the demands of strawberry farmers, the agricultural technology knowledge related to strawberry cultivation was categorized into six themes: basic production knowledge, variety screening, interplanting knowledge, pest diagnosis and control, disease diagnosis and control, and pesticide damage diagnosis and control. Considering the current state of agricultural technology, two primary tasks were formulated: named entity recognition and question answering on agricultural knowledge. A training corpus of entity type annotations and question-answer pairs was constructed by combining automatic machine annotation with manual annotation, yielding a small yet high-quality sample. After comparing four existing LLMs (Baichuan2-13B-Chat, ChatGLM2-6B, Llama 2-13B-Chat, and ChatGPT), the best-performing model was chosen as the base LLM for the intelligent question-answering system for agricultural technology knowledge. Using the high-quality corpus, a pre-trained LLM, and fine-tuning, a deep neural network capable of semantic analysis, context association, and content generation was trained. This model served as an LLM for named entity recognition and question answering on agricultural knowledge, adaptable to various downstream tasks. For the named entity recognition task, LoRA fine-tuning was employed, which updates only a small set of essential parameters to speed up training and improve performance. For the question-answering task, the Prompt-tuning method was used to fine-tune the LLM, with adjustments made based on the model's generated content to achieve iterative optimization. Model performance was optimized from two perspectives: data and model design. On the data side, redundant or unclear data was manually removed from the labeled corpus. On the model side, a strategy based on retrieval-augmented generation (RAG) was employed to deepen the LLM's understanding of agricultural knowledge and keep its knowledge synchronized in real time, alleviating the problem of LLM hallucination. Building on the constructed LLM, an intelligent question-answering system for agricultural technology knowledge was developed. The system can generate high-precision, unambiguous answers and supports multi-round question answering and retrieval of information sources.
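The retrieval-based strategy described in the Methods can be illustrated with a minimal sketch. The abstract does not describe the retriever, the knowledge base, or the prompt format, so everything below is an assumption for illustration: a bag-of-words cosine similarity stands in for whatever retriever the system actually uses, the function names are hypothetical, and the passages are placeholders rather than corpus text. The simple tokenizer only handles English; a Chinese corpus would need a word segmenter instead.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, passages, k=2):
    """Return the k passages most similar to the question, best first."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(p))), i) for i, p in enumerate(passages)]
    # Sort by descending score; ties keep the original passage order.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [passages[i] for _, i in scored[:k]]


def build_prompt(question, passages, k=2):
    """Number the top-k passages and place them before the question."""
    context = retrieve(question, passages, k)
    numbered = "\n".join(f"[{i}] {p}" for i, p in enumerate(context, start=1))
    return (
        "Answer using only the numbered sources and cite them.\n"
        f"Sources:\n{numbered}\n"
        f"Question: {question}\nAnswer:"
    )


# Placeholder passages (hypothetical, not taken from the paper's corpus).
PASSAGES = [
    "Placeholder note: gray mold control for strawberry.",
    "Placeholder note: transplanting dates for strawberry seedlings.",
    "Placeholder note: spider mite symptoms affecting leaves.",
]
QUESTION = "How to control gray mold on strawberry?"
PROMPT = build_prompt(QUESTION, PASSAGES, k=2)
```

The resulting `PROMPT` would then be passed to the LLM. Numbering the sources is one way a system could support the retrieval of information sources mentioned above, since the answer can cite the passage it relied on.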
[Results and Discussions] Accuracy rate and recall rate served as indicators of the LLMs' performance on the named entity recognition task. The results indicated that LLM performance was closely related to factors such as model structure, the scale of the labeled corpus, and the number of entity types. After fine-tuning, the ChatGLM LLM achieved the highest accuracy and recall rates. With the same number of entity types, a larger annotated corpus resulted in a higher accuracy rate. Fine-tuning affected the models differently; overall, it improved the average accuracy of all models across the knowledge topics, with the values for ChatGLM, Llama, and Baichuan all surpassing 85%. The average recall rate increased only slightly and in some cases was even lower than before fine-tuning. For the question-answering task, evaluated with hallucination rate and semantic similarity as indicators, data optimization and retrieval-augmented generation reduced the hallucination rate by 10% to 40% and improved semantic similarity by more than 15%. These optimizations significantly improved the correctness, logic, and comprehensiveness of the content generated by the models.
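The two named entity recognition indicators can be made concrete with a small sketch. It assumes that the "accuracy rate" is entity-level precision (the share of predicted entities that are correct) and that an entity counts as correct only when both its text and its type match the annotation; the abstract does not confirm either assumption. The entity labels and examples are hypothetical.

```python
def precision_recall(gold, predicted):
    """Entity-level precision and recall over (text, type) pairs.

    An entity is a true positive only if both its text and its type
    match a gold annotation exactly (an assumption of this sketch).
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    return precision, recall


# Hypothetical annotations for one sentence.
GOLD = [("gray mold", "DISEASE"), ("spider mite", "PEST"), ("variety_x", "VARIETY")]
# Hypothetical model output: one correct entity, one with the wrong type.
PREDICTED = [("gray mold", "DISEASE"), ("spider mite", "DISEASE")]

PRECISION, RECALL = precision_recall(GOLD, PREDICTED)
```

Under these definitions the two indicators can move independently: a model that predicts fewer but more reliable entities raises precision while leaving recall flat or lower, which is one possible reading of the limited recall gains reported above.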
[Conclusion] The pre-trained ChatGLM LLM exhibited superior performance in named entity recognition and question-answering tasks in the agricultural field. Fine-tuning pre-trained LLMs for downstream tasks and optimizing them with retrieval-augmented generation mitigated the hallucination problem and markedly improved model performance. LLM technology has the potential to innovate agricultural technology knowledge service modes and optimize agricultural knowledge extension. It can effectively reduce the time farmers spend obtaining high-quality, effective knowledge, guiding more farmers towards agricultural technology innovation and transformation. However, because of challenges such as unstable performance, further research is needed on optimization methods for LLMs and their application in specific scenarios.

Key words

LLM / generative pre-trained transformer / agricultural technology knowledge / intelligent question answering / named entity recognition

Classification

Information Technology and Security Science

Cite this article

王婷, 王娜, 崔运鹏, 刘娟. Agricultural Technology Knowledge Intelligent Question-Answering System Based on Large Language Model[J]. Smart Agriculture, 2023, 5(4): 105-116.

Funding

Beijing Digital Agriculture Innovation Consortium Project (BAIC10-2023)

Fundamental Research Funds of Chinese Academy of Agricultural Sciences (JBYW-AII-2023-31)

National Key Research and Development Program of China (2022YFF0711902)

Smart Agriculture (智慧农业(中英文))

OA / CSCD / CSTPCD

ISSN 2096-8094
