Health Development and Policy Research (健康发展与政策研究), 2026, Vol. 29, Issue 1: 156-165, 10. DOI: 10.12458/HDPR.202505019
Terminology extraction and knowledge graph construction in pharmaceutical regulation (医药监管领域的术语挖掘及知识图谱构建)
Abstract
Objective: This study aims to develop an AI-driven knowledge management method for pharmaceutical regulatory contexts, so as to enhance the visualization, retrieval, and applicability of regulatory knowledge. Method: Based on pharmaceutical regulatory texts, a terminology extraction method leveraging a supervised fine-tuned large language model (LLM) was proposed, on the basis of which a structured knowledge graph (KG) for pharmaceutical regulation was constructed. Results: Comparative evaluation identified Qwen2.5-7B-Instruct-sft as the optimal extraction model, with a precision of 0.363, recall of 0.838, and an F1-score of 0.506. Using a multi-feature fusion-based term filtering method (Domain+G-value), the precision, recall, and F1-score reached 0.857, 0.823, and 0.840, respectively. The fine-tuned LLM initially extracted 63 672 candidate terms, which were refined to 7 359 following the filtering process. The resulting pharmaceutical regulatory KG comprises 9 719 entities and 33 216 relations. Conclusion: This study constructs a pharmaceutical regulatory KG based on a domain terminology lexicon, which effectively organizes fragmented regulatory information into a structured and standardized knowledge system and enables the visualized representation of regulatory texts.
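The reported metrics can be cross-checked with a short calculation. The sketch below assumes the standard definition of the F1-score as the harmonic mean of precision and recall; the abstract does not state the formula, and the function name `f1_score` and the dictionary labels are illustrative, not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition, assumed here)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as given in the abstract.
reported = {
    "Qwen2.5-7B-Instruct-sft (extraction only)": (0.363, 0.838, 0.506),
    "Domain+G-value filtering": (0.857, 0.823, 0.840),
}

for name, (p, r, f1_reported) in reported.items():
    f1 = f1_score(p, r)
    print(f"{name}: recomputed F1 = {f1:.3f}, reported F1 = {f1_reported:.3f}")
```

Under this assumption, the recomputed values agree with the reported ones to within about 0.001, a gap that is expected when precision and recall are themselves rounded to three decimals.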
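Keywords
pharmaceutical regulation / terminology extraction / knowledge graph / large language model
Classification
Medicine and health
Cite this article
莫鸿仪, 杨铖锴, 王皓阳, 侯凤贞, 甘珏. Terminology extraction and knowledge graph construction in pharmaceutical regulation [J]. Health Development and Policy Research (健康发展与政策研究), 2026, 29(1): 156-165, 10.
Funding
General Project of Philosophy and Social Science Research in Jiangsu Universities (2022SJYB0058)
General Project of Humanities and Social Sciences Research, Ministry of Education (24YJCZH441)
National College Students' Innovation and Entrepreneurship Training Program (202510316067)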