
大语言模型驱动的北洋政府文书资源知识图谱构建研究

邓君, 张子姝, 潘禹兵, 叶东宇, 常严予


现代情报, 2026, Vol. 46, Issue 4: 57-67, 11. DOI: 10.3969/j.issn.1008-0821.2026.04.005


Research on the Construction of Beiyang Government Document Resources Knowledge Graph Driven by Large Language Models

邓君¹, 张子姝¹, 潘禹兵¹, 叶东宇¹, 常严予¹

Author information

1. School of Business and Management, Jilin University, Changchun, Jilin 130012, China


Abstract

[Purpose/Significance] This paper builds a knowledge graph with a large language model to support the intelligent development and application of Beiyang Government Document Resources, transforming fragmented and isolated historical documents into a deep semantic network, with the goal of advancing intelligent historical research and the public dissemination of history.

[Method/Process] This study designs a framework, driven by large language models, for constructing a knowledge graph of Beiyang Government Document Resources. The framework relies on the KGGen knowledge graph generation model and integrates the entire process of knowledge representation modeling, entity-relationship extraction, and knowledge graph generation. First, a data collection and preprocessing workflow covering structured, semi-structured, and unstructured texts was designed. In line with the requirements of the large language model task, corpus cleaning, word segmentation analysis, and data annotation were completed, forming a standardized domain corpus of Beiyang Government Document Resources. Subsequently, a knowledge representation model of Beiyang Government Document Resources was designed for the large language model extraction task. It defines entity category labels including institutions, individuals, positions, decrees, documents, locations, and events, as well as sixty relationship labels covering appointment, nomination, succession, removal, resignation, and leadership relationships. Ablation experiments were conducted with accuracy, recall, and F1 as evaluation metrics. The results showed that the proposed framework performed best in the knowledge extraction task on Beiyang Government Document Resources, largely because entities and relationships were accurately annotated in the preprocessing stage and constraints from the knowledge representation model were applied in the extraction stage. Finally, the KGGen model was deployed to construct the knowledge graph of Beiyang Government Document Resources. Visual analysis was then conducted on the constructed knowledge graph, and intelligent question-answering services were provided.

[Result/Conclusion] Experimental results show that, in both entity recognition and relationship extraction, the KGGen model outperforms the comparison models on all evaluation metrics. The framework effectively reveals the inherent knowledge structure of Beiyang Government Document Resources, constructs a high-quality, systematic knowledge representation, and provides a reusable and transferable methodological reference for building knowledge graphs from low-resource modern historical documents.
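The abstract attributes the framework's extraction quality partly to constraints from the knowledge representation model. The sketch below illustrates that idea only; it is not the authors' implementation and does not use the KGGen API. The entity categories come from the abstract, while the head/tail type constraints in `RELATION_SCHEMA`, the `Triple` class, and both helper functions are hypothetical. The scoring helper computes precision, recall, and F1 over exact-match triples, which is an assumption about how the reported metrics are defined.

```python
from dataclasses import dataclass

# Entity categories named in the abstract.
ENTITY_TYPES = {
    "institution", "person", "position", "decree",
    "document", "location", "event",
}

# Hypothetical (head type, tail type) constraints for a few of the relation
# labels named in the abstract. The paper's actual schema is not shown here.
RELATION_SCHEMA = {
    "appointment": ("person", "position"),
    "removal": ("person", "position"),
    "resignation": ("person", "position"),
    "succession": ("person", "person"),
    "leadership": ("person", "institution"),
}


@dataclass(frozen=True)
class Triple:
    """One extracted (head, relation, tail) fact with typed endpoints."""
    head: str
    head_type: str
    relation: str
    tail: str
    tail_type: str


def conforms(triple: Triple) -> bool:
    """Return True if the relation is known and its endpoint types match."""
    expected = RELATION_SCHEMA.get(triple.relation)
    return expected == (triple.head_type, triple.tail_type)


def precision_recall_f1(predicted, gold):
    """Score predicted triples against gold triples by exact match."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Filtering raw model output with `conforms` before scoring drops triples whose endpoint types violate the schema, which can raise precision without changing recall when the discarded triples were wrong anyway.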

Keywords (Chinese)

大语言模型/北洋政府文书资源/知识图谱/KGGen模型/知识抽取

Keywords

large language models / Beiyang Government Document Resources / knowledge graph / KGGen / knowledge extraction

Classification

Social sciences

Cite this article

邓君, 张子姝, 潘禹兵, 叶东宇, 常严予. 大语言模型驱动的北洋政府文书资源知识图谱构建研究[J]. 现代情报, 2026, 46(4): 57-67, 11.

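Funding

Key Project of the National Social Science Fund of China, "Research on Archival Data Resource Mining and Smart Services under the National Cultural Digitization Strategy" (Project No. 23ATQ001).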

现代情报

OA, CHSSCD

ISSN: 1008-0821
