Journal of Library and Information Science in Agriculture, 2025, Vol. 37, Issue (3): 4-17, 14. DOI: 10.13998/j.cnki.issn1002-1248.25-0218
Research on DeepSeek-Empowered Low-Cost Construction of Domain-Specific Knowledge Graphs
Abstract
[Purpose/Significance] In recent years, large language models (LLMs) have achieved breakthroughs in semantic understanding and generation through pre-training on massive text corpora, giving new impetus to knowledge engineering. As a structured knowledge carrier, the knowledge graph has unique advantages in integrating multi-source heterogeneous data and building industrial knowledge systems. Against the paradigm shift in knowledge engineering driven by open-source LLMs such as DeepSeek, this study proposes a cost-effective, DeepSeek-based method for constructing domain knowledge graphs. It aims to address the limitations of traditional domain knowledge graph construction: heavy dependence on expert rules, the high cost of manual annotation, and inefficient processing of multi-source data.

[Method/Process] Building on a methodological framework of manual ontology modelling, we propose a domain knowledge extraction technology system that combines enhanced semantic understanding with prompt engineering. The acquired data were preprocessed with ETL tools, MinerU, and other tools, and the DeepSeek-R1 application programming interface (API) was then invoked for intelligent extraction. The ontology model was designed according to domain cognitive features, and a multi-source heterogeneous data fusion method was used to give the data a unified structural representation. By combining DeepSeek with knowledge extraction and leveraging the model's strong textual reasoning ability, the system offers a cost-effective, reusable technical paradigm for constructing domain knowledge graphs and for efficient knowledge extraction.

[Results/Conclusions] We take the construction of a domain knowledge graph for the whole pig industry chain as the empirical case. We define the structure of the industry chain, identify 21 types of core entities, and describe their attribute relationships, thereby achieving knowledge modelling of the pig industry with a focus on smart farming. The proposed method was also used to process and extract knowledge from online and offline resource data. Preliminary experiments show that, under zero-shot conditions, DeepSeek-R1 achieves an F1 score of 0.92 when recognizing the attributes of 161 diseases and 11 types of entities in pig disease control scenarios. The experiments also confirm that the method can be reused for other links in the chain. The constructed knowledge graph of the whole pig industry chain will be used to design and validate intelligent application scenarios, with the aim of promoting intelligent information processing in the pig industry. This study proposes a collaborative paradigm for constructing domain knowledge graphs with DeepSeek that combines deep learning with manual calibration, ensuring both efficient and accurate knowledge extraction, and it verifies the knowledge extraction potential of LLMs in vertical domains. The findings contribute to the literature and offer a practical reference for promoting DeepSeek-enabled, cost-effective construction of knowledge graphs.
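The extraction and evaluation steps summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the ontology fragment, the prompt wording, the JSON answer format, and every function name (`build_prompt`, `parse_triples`, `f1_score`, `extract`, `fake_llm`) are assumptions made for illustration, and the model call is replaced by a canned stub so the example runs offline. It shows a zero-shot prompt built from an ontology (schema and instructions only, no worked examples), filtering of the answer against that ontology, and an F1 score computed against a toy gold set.

```python
import json
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical ontology fragment: entity type -> attribute names.
# The abstract does not list the paper's actual entity types or attributes.
ONTOLOGY: Dict[str, List[str]] = {"Disease": ["pathogen", "symptom"]}

Triple = Tuple[str, str, str]  # (entity, attribute, value)


def build_prompt(ontology: Dict[str, List[str]], text: str) -> str:
    """Build a zero-shot prompt: schema and instructions, no labelled examples."""
    schema = "\n".join(
        f"- {etype}: {', '.join(attrs)}" for etype, attrs in ontology.items()
    )
    return (
        "Extract entities and attribute values from the text.\n"
        f"Allowed entity types and attributes:\n{schema}\n"
        'Answer with a JSON list of objects with keys "entity", "attribute", "value".\n'
        f"Text: {text}"
    )


def parse_triples(raw: str, ontology: Dict[str, List[str]]) -> Set[Triple]:
    """Parse the model's JSON answer and drop attributes outside the ontology."""
    allowed = {a for attrs in ontology.values() for a in attrs}
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return set()  # malformed output would be left for manual calibration
    if not isinstance(items, list):
        return set()
    return {
        (str(i["entity"]), str(i["attribute"]), str(i["value"]))
        for i in items
        if isinstance(i, dict)
        and i.get("attribute") in allowed
        and "entity" in i
        and "value" in i
    }


def f1_score(predicted: Set[Triple], gold: Set[Triple]) -> float:
    """F1 over exact-match triples (harmonic mean of precision and recall)."""
    tp = len(predicted & gold)
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)


def extract(text: str, call_llm: Callable[[str], str],
            ontology: Dict[str, List[str]] = ONTOLOGY) -> Set[Triple]:
    """Run one extraction; call_llm is any function mapping a prompt to a reply."""
    return parse_triples(call_llm(build_prompt(ontology, text)), ontology)


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model API call; returns a canned answer."""
    return json.dumps([
        {"entity": "African swine fever", "attribute": "pathogen", "value": "ASFV"},
        {"entity": "African swine fever", "attribute": "symptom", "value": "high fever"},
        {"entity": "African swine fever", "attribute": "vaccine", "value": "none"},
    ])


# Toy gold annotations, invented for this example only.
gold: Set[Triple] = {
    ("African swine fever", "pathogen", "ASFV"),
    ("African swine fever", "symptom", "high fever"),
    ("African swine fever", "symptom", "skin haemorrhage"),
}
pred = extract("Sample passage about African swine fever.", fake_llm)
score = f1_score(pred, gold)
print(f"{len(pred)} triples kept, F1 = {score:.2f}")
```

Passing the model call in as a plain function keeps the sketch independent of any particular client library; in practice `fake_llm` would be swapped for a wrapper around the DeepSeek-R1 API, whose request format is not described in the abstract.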
Keywords
DeepSeek/knowledge extraction/knowledge graph/zero-shot learning/knowledge foundation/swine/whole industry chain
Classification
Social sciences
Cite this article
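SHI Zhongyan, LEI Jie, SUN Tan, ZHAO Ruixue, LI Jiao, HUANG Yongwen, XIAN Guojian. Research on DeepSeek-Empowered Low-Cost Construction of Domain-Specific Knowledge Graphs[J]. Journal of Library and Information Science in Agriculture, 2025, 37(3): 4-17, 14.
Funding
National Social Science Fund of China, General Project "Research on Semantic Organization and Association Discovery Services for Multimodal Scientific and Technological Resources" (22BTQ079)
China Association for Science and Technology Young Elite Scientists Sponsorship Program project "Research on Semantic Recognition and Parsing of Scientific Argumentation in Research Papers" (2022QNRC001)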