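Research on Information Science Theoretical Entity Extraction Based on Collaboration Between Large and Small Language Models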

Yao Rujing, Wang Fang

Journal of Modern Information (现代情报), 2025, Vol. 45, Issue (4): 3-11, 73, 10. DOI: 10.3969/j.issn.1008-0821.2025.04.001

Research on Information Science Theoretical Entity Extraction Based on Collaboration Between Large and Small Language Models

Yao Rujing¹, Wang Fang¹

Author information

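  • 1. Department of Information Resources Management, Business School, Nankai University, Tianjin 300071; Center for Network Society Governance, Nankai University, Tianjin 300071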

Abstract

[Purpose/Significance] Theory is an essential component in the construction and development of information science as a discipline. Organizing and analyzing theories not only helps in understanding the origins and developmental trajectory of the discipline but also helps predict the development of emerging technologies. Efficient and accurate identification of theoretical entities plays a crucial role in deepening theoretical research. [Method/Process] This paper proposes an information science theory extraction algorithm based on collaboration between large and small language models, comprising three modules: enhanced word embedding vectors, sample difficulty assessment, and a theory recognition model. First, large language models are used to pre-identify theoretical entities. These pre-identified entities are combined with the original word embeddings to form enhanced word embeddings, which are used to optimize the training of a domain-specific small model. In addition, large language models are used to assess the difficulty of samples, and the training strategy is adjusted accordingly to improve model performance. The proposed algorithm integrates the strong semantic understanding of large language models with the domain expertise of domain-specific small models. [Result/Conclusion] Experiments on a dataset for theoretical entity extraction in information science show that the proposed algorithm effectively improves extraction performance, achieving the best results in precision, recall, and F1 score.
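
The two collaboration mechanisms named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the pre-identified entities are combined with the original embeddings, nor how the training strategy responds to sample difficulty, so everything below is an assumption made for illustration: the combination is shown as appending a 0/1 entity flag to each token vector, and the difficulty adjustment is shown as a per-sample loss weight of `1 + alpha * difficulty`. The names `enhance_embeddings`, `difficulty_weighted_loss`, and `alpha` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def enhance_embeddings(
    token_vectors: Sequence[Sequence[float]],
    llm_entity_mask: Sequence[int],
) -> List[List[float]]:
    """Append the LLM pre-identification flag (1 = inside a theory entity,
    0 = outside) to each token vector. Concatenation is an assumption."""
    if len(token_vectors) != len(llm_entity_mask):
        raise ValueError("one mask entry per token is required")
    return [
        list(vec) + [float(flag)]
        for vec, flag in zip(token_vectors, llm_entity_mask)
    ]


def difficulty_weighted_loss(
    losses: Sequence[float],
    difficulties: Sequence[float],
    alpha: float = 1.0,
) -> float:
    """Weighted mean of per-sample losses. Each weight is
    1 + alpha * difficulty, with difficulty assumed to be an LLM-assigned
    score in [0, 1]. This weighting scheme is an assumption."""
    if len(losses) != len(difficulties):
        raise ValueError("one difficulty score per sample is required")
    weights = [1.0 + alpha * d for d in difficulties]
    weighted_sum = sum(w * l for w, l in zip(weights, losses))
    return weighted_sum / sum(weights)


if __name__ == "__main__":
    # Toy 2-dimensional embeddings for three tokens; the first two tokens
    # were flagged by the (hypothetical) LLM pre-identification step.
    vectors = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
    mask = [1, 1, 0]
    print(enhance_embeddings(vectors, mask))

    # Two samples: an easy one (difficulty 0.0) and a hard one (1.0).
    print(difficulty_weighted_loss([1.0, 3.0], [0.0, 1.0]))
```

In this sketch, setting `alpha=0.0` reduces the weighted loss to a plain mean, which makes it easy to compare training with and without the difficulty signal. A real system would feed the enhanced vectors to a sequence-labeling model rather than printing them.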

Key words

large language model / information science theory / entity recognition / sample learning difficulty / model collaboration

Classification

Computer science and automation

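Cite this article

Yao Rujing, Wang Fang. Research on Information Science Theoretical Entity Extraction Based on Collaboration Between Large and Small Language Models [J]. Journal of Modern Information (现代情报), 2025, 45(4): 3-11, 73, 10.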

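Funding

Major Project of the National Social Science Fund of China, "Research on Intelligent Governance of Digital Government Based on Data Sharing and Knowledge Reuse" (Grant No. 20ZDA039).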

Journal of Modern Information (现代情报)

Open access; Peking University Core Journal (北大核心)

ISSN 1008-0821
