Information Extraction in Materials Science Based on Large Language Model

时宗彬 1, 乐小虬 1

软件导刊, 2026, Vol. 25, Issue 1: 1-9, 9. DOI: 10.11907/rjdk.241885

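Author information

  • 1. National Science Library, Chinese Academy of Sciences; School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190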

Abstract

The distinctive terminology and intricate semantic relationships of scientific literature remain a significant obstacle to using existing large language models (LLMs) directly for extracting specialized domain knowledge. This study addresses the limited knowledge extraction capability of LLMs in materials science. We propose improving that capability through domain-specific fine-tuning and show how the fine-tuned LLM can be deployed to extract key information from materials science literature. In the first stage, an LLM with strong language capabilities is combined with prompt engineering to assist the manual annotation of a materials-domain dataset. Subsequently, a 7B-parameter LLM is fine-tuned with QLoRA, enabling it to extract information from materials science literature accurately according to instructions. The fine-tuned LLM achieved an F1 score of 0.94 in material entity recognition. It also showed high accuracy in material type recognition and material property value extraction, with scores of 0.91 and 0.89, respectively. The experimental results demonstrate that the method handles the complex terminology and semantic relationships of materials science literature effectively, exhibits robust generalization, and offers a novel approach to knowledge extraction in the field of materials science.
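To make the reported F1 figures concrete, the following is a minimal sketch of how an entity-level F1 score can be computed. It assumes exact-match, case-insensitive comparison of predicted and reference entity sets; the abstract does not state the scoring protocol the authors used, and the function name `entity_f1` and the sample entities are hypothetical, chosen only for illustration.

```python
from typing import Iterable, Set, Tuple


def entity_f1(predicted: Iterable[str], gold: Iterable[str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for exact-match entity sets.

    Assumption: entities are compared as normalized strings (stripped,
    lower-cased); this is an illustrative protocol, not the paper's.
    """
    pred_set: Set[str] = {p.strip().lower() for p in predicted}
    gold_set: Set[str] = {g.strip().lower() for g in gold}

    true_positives = len(pred_set & gold_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0

    # F1 is the harmonic mean of precision and recall; define it as 0
    # when both are 0 to avoid division by zero.
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: entities extracted by a model vs. a hand-labelled reference.
gold = ["TiO2", "graphene", "LiFePO4", "SiC"]
predicted = ["TiO2", "graphene", "LiFePO4", "ZnO"]

precision, recall, f1 = entity_f1(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

In this toy case three of the four predicted entities match the reference, so precision, recall, and F1 are all 0.75. A stricter or looser matching rule (for example, span-level or partial matching) would give different values for the same predictions.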

Key words

large language models/prompt engineering/QLoRA/instruction fine-tuning/information extraction

Classification

Information Technology and Security Science

Cite this article

时宗彬, 乐小虬. Information Extraction in Materials Science Based on Large Language Model[J]. 软件导刊, 2026, 25(1): 1-9, 9.

Funding

National Social Science Fund of China (23BTQ102)

软件导刊 (ISSN 1672-7800)
