Reference Automatic Identification and Bibliographic Information Extraction Based on Large Language Models (基于大语言模型的参考文献自动识别与著录信息抽取)

Chen He (陈和)¹

情报杂志 (Journal of Intelligence), 2025, Vol. 44, Issue 7: 192-198, 7. DOI: 10.3969/j.issn.1002-1965.2025.07.023

Author information

1. Xiamen University Library, Xiamen 361005

Abstract

[Research purpose] This study uses a large language model to automatically identify individual references in reference-list text and to extract the bibliographic information of each identified reference, offering new ideas and methods for text recognition work. [Research method] Using Python programming and a case study, we design and optimize a Prompt template and call the service API of the Baidu Qianfan ERNIE-Speed large language model for question-and-answer interaction. This identifies references one by one in reference-list text and then extracts bibliographic information such as authors, title, publication name, and publication year from each reference. [Research result/conclusion] Compared with traditional text recognition methods, using large language models to identify references and extract bibliographic information offers a low usage threshold, lenient requirements on the target text data, high recognition accuracy, and high extraction efficiency. Large language models also have limitations: the length of input and output content is limited, and "hallucination" and "politeness" behaviors increase the complexity of data processing.
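The workflow summarized in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the prompt wording, the JSON output format, the function names, and the `max_chars` limit are all assumptions made for illustration, and the model call is passed in as a plain function (`ask_llm`) rather than a real Baidu Qianfan API call, whose details the abstract does not give. The sketch shows three ideas the abstract mentions: splitting input to respect the length limit, discarding "polite" text around the answer, and a crude check against hallucinated records.

```python
import json
import re
from typing import Callable

# Hypothetical prompt template; the paper's actual template is not in the abstract.
PROMPT_TEMPLATE = (
    "Split the text below into individual references. For each one, return "
    "a JSON array of objects with the keys authors, title, publication, year. "
    "Return only the JSON array.\n\nText:\n{text}"
)

FIELDS = ("authors", "title", "publication", "year")


def chunk_lines(lines: list[str], max_chars: int) -> list[str]:
    """Group lines into chunks of at most max_chars to respect the input limit.

    Assumes a reference does not span a chunk boundary; a single line longer
    than max_chars is kept whole as its own chunk.
    """
    chunks, current = [], ""
    for line in lines:
        candidate = f"{current}\n{line}" if current else line
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = line
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def parse_reply(reply: str) -> list[dict]:
    """Pull the JSON array out of a reply, ignoring polite text around it."""
    match = re.search(r"\[.*\]", reply, flags=re.DOTALL)
    if match is None:
        return []
    try:
        records = json.loads(match.group(0))
    except json.JSONDecodeError:
        return []
    return [r for r in records if isinstance(r, dict)]


def is_grounded(record: dict, source: str) -> bool:
    """Crude hallucination check: the title (and year, if given) must occur in the source."""
    title = str(record.get("title", ""))
    year = str(record.get("year", ""))
    return bool(title) and title in source and year in source


def extract_references(
    text: str, ask_llm: Callable[[str], str], max_chars: int = 2000
) -> list[dict]:
    """Send each chunk to the model and keep only records found in the source."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    results = []
    for chunk in chunk_lines(lines, max_chars):
        reply = ask_llm(PROMPT_TEMPLATE.format(text=chunk))
        for record in parse_reply(reply):
            if is_grounded(record, chunk):
                results.append({key: record.get(key, "") for key in FIELDS})
    return results
```

Passing the model call in as a function keeps the sketch independent of any particular service; in practice `ask_llm` would wrap whichever API client is used.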

Keywords

large language model / text mining / text recognition / information extraction / references / bibliographic rules

Classification

Social sciences

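Cite this article

Chen He. Reference Automatic Identification and Bibliographic Information Extraction Based on Large Language Models [J]. 情报杂志 (Journal of Intelligence), 2025, 44(7): 192-198, 7.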

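Funding

This work is a research output of the Fujian Provincial Social Science Fund project "An Empirical Study of the Academic Influence of Mainland Scientific and Technical Literature in Taiwan in the Context of Cross-Strait Integrated Development" (No. FJ2022B140).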

情报杂志 (Journal of Intelligence)

Open access; Peking University Core Journal

ISSN: 1002-1965
