Journal of Agricultural Big Data 2024, Vol. 6, Issue (1): 1-8, 8. DOI: 10.19788/j.issn.2096-6369.100002
A Dataset for Constructing Agricultural Knowledge Graph
Abstract
Using information technology to improve the efficiency of agricultural production and to address its problems is crucial for the development of agriculture in China. The growth of information technology has generated massive amounts of data, most of which are scattered across the Internet in fragmented, unstructured forms. In agriculture in particular, traditional search engines rarely retrieve valuable information efficiently and accurately, and users often spend considerable time and effort collecting and organizing secondary data from large volumes of unorganized material. To address these issues, this paper uses web crawler technology to mine data from publicly available agricultural websites. Through automatic or semi-automatic data cleaning, denoising, and other processing, the unstructured data are reorganized into structured data, which are ultimately stored in the form of a knowledge graph. The dataset for constructing the agricultural knowledge graph covers 11 agricultural categories, such as grain crops, cash crops, fruits, and vegetables. Specifically, it includes 461 types of grain crops, 2,208 types of cash crops, 1,294 types of fruits, 257 types of vegetables, 118 types of edible fungi, 1,161 types of flowers and trees, 142 types of aquatic products, 113 types of pesticides, 1,605 types of crop diseases and pests, 519 types of veterinary drugs, and 603 types of Chinese herbal medicines, totaling 8,481 subcategories. The agricultural knowledge graph constructed from this dataset contains 90,508 triples, which can provide basic data support for the development of human-machine interactive intelligent applications such as agricultural knowledge Q&A and recommendation systems. Meanwhile, integrating agricultural knowledge graphs into generative large language models can help achieve more efficient and accurate information retrieval and intelligent decision-making in vertical domains.
Keywords
agricultural data/network mining/knowledge mapping/datasets
Cite this article
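陈雷, 周娜, 朱芃璇, 袁媛. A Dataset for Constructing Agricultural Knowledge Graph [J]. Journal of Agricultural Big Data, 2024, 6(1): 1-8, 8.
Funding
National Natural Science Foundation of China (32071901, 32271981)
National Basic Science Data Center project (NBSDC-DB-20)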
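Illustrative sketch
The per-category counts reported in the abstract can be checked against the stated total of 8,481 subcategories with a few lines of Python. The sketch below also shows one simple way that (head, relation, tail) triples might be held in memory. The category counts come from the abstract. The example triples, the relation names, and the helper names `total_subcategories` and `index_by_head` are assumptions made for illustration and are not taken from the dataset or the paper.

```python
from collections import defaultdict

# Category counts as reported in the abstract.
CATEGORY_COUNTS = {
    "grain crops": 461,
    "cash crops": 2208,
    "fruits": 1294,
    "vegetables": 257,
    "edible fungi": 118,
    "flowers and trees": 1161,
    "aquatic products": 142,
    "pesticides": 113,
    "crop diseases and pests": 1605,
    "veterinary drugs": 519,
    "Chinese herbal medicines": 603,
}


def total_subcategories(counts):
    """Sum the per-category item counts."""
    return sum(counts.values())


# Hypothetical triples for illustration only; the entities and relation
# names are assumptions and do not come from the published dataset.
EXAMPLE_TRIPLES = [
    ("wheat", "belongs_to", "grain crops"),
    ("wheat", "affected_by", "stripe rust"),
    ("stripe rust", "belongs_to", "crop diseases and pests"),
]


def index_by_head(triples):
    """Group (relation, tail) pairs under each head entity."""
    index = defaultdict(list)
    for head, relation, tail in triples:
        index[head].append((relation, tail))
    return dict(index)


if __name__ == "__main__":
    # Prints: 11 8481
    print(len(CATEGORY_COUNTS), total_subcategories(CATEGORY_COUNTS))
    print(index_by_head(EXAMPLE_TRIPLES)["wheat"])
```

The 11 counts sum to 8,481, matching the total given in the abstract. Indexing triples by head entity is only one possible layout; a graph database or an RDF store would serve the same purpose at the scale of 90,508 triples.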