Digital Chinese Medicine, 2026, Vol. 9, Issue (1): 1-12. DOI: 10.1016/j.dcmed.2026.02.002
QingNangTCM: a parameter-efficient fine-tuning large language model for traditional Chinese medicine
Abstract
Objective To develop QingNangTCM, a specialized large language model (LLM) tailored for expert-level traditional Chinese medicine (TCM) question answering and clinical reasoning, addressing the scarcity of domain-specific corpora and specialized alignment.

Methods We constructed QnTCM_Dataset, a corpus of 100 000 entries, by integrating data from ShenNong_TCM_Dataset and SymMap v2.0 and by synthesizing additional samples via retrieval-augmented generation (RAG) and persona-driven generation. The dataset comprehensively covers diagnostic inquiries, prescriptions, and herbal knowledge. Using P-Tuning v2, we fine-tuned the GLM-4-9B-Chat backbone to develop QingNangTCM. A multi-dimensional evaluation framework assessing accuracy, coverage, consistency, safety, professionalism, and fluency was established, using metrics such as bilingual evaluation understudy (BLEU), recall-oriented understudy for gisting evaluation (ROUGE), and metric for evaluation of translation with explicit ordering (METEOR), together with LLM-as-a-Judge and expert review. Qualitative analysis was conducted across four simulated clinical scenarios: symptom analysis, disease treatment, herb inquiry, and failure cases. Baseline models included GLM-4-9B-Chat, DeepSeek-V2, HuatuoGPT-II (7B), and GLM-4-9B-Chat (freeze-tuning).

Results QingNangTCM achieved the highest scores in BLEU-1/2/3/4 (0.425/0.298/0.137/0.064), ROUGE-1/2 (0.368/0.157), and METEOR (0.218), demonstrating a balanced and superior normalized performance profile of 0.900 across the dimensions of accuracy, coverage, and consistency. Although its ROUGE-L score (0.299) was lower than that of HuatuoGPT-II (7B) (0.351), it significantly outperformed domain-specific models in expert-validated win rates for professionalism (86%) and safety (73%). Qualitative analysis confirmed that the model strictly adheres to the "symptom-syndrome-pathogenesis-treatment" reasoning chain, though occasional misclassifications and hallucinations persisted for rare medicinal materials and uncommon syndromes.
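The BLEU-n scores reported above rest on clipped n-gram precision between a candidate answer and a reference. A minimal pure-Python sketch of that core computation follows; the `ngram_precision` helper and the example strings are illustrative only (a real evaluation would use an established toolkit with brevity penalty and corpus-level aggregation):

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_precision(candidate, reference, n):
    """BLEU-style clipped n-gram precision for one sentence pair:
    candidate n-gram counts are capped by their reference counts."""
    cand = Counter(ngrams(candidate, n))
    ref = Counter(ngrams(reference, n))
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / max(sum(cand.values()), 1)

reference = "warm the middle and dispel cold".split()
candidate = "warm the middle to dispel cold".split()
print(round(ngram_precision(candidate, reference, 1), 3))  # → 0.833
print(round(ngram_precision(candidate, reference, 2), 3))  # → 0.6
```

Higher-order precisions (BLEU-3/4) fall off quickly when wording diverges, which is why the absolute values above decrease from 0.425 at n=1 to 0.064 at n=4.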
Conclusion Combining domain-specific corpus construction with parameter-efficient prompt tuning enhances the reasoning behavior and domain adaptation of LLMs for TCM-related tasks. This work provides a technical framework for the digital organization and intelligent utilization of TCM knowledge, with potential value for supporting diagnostic reasoning and medical education.
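The single "normalized performance profile" (0.900) cited in the results suggests per-dimension scores rescaled to a common range and then averaged. The sketch below shows one such min-max scheme; the `normalized_profile` helper and all numbers are hypothetical, not the paper's actual data or aggregation method:

```python
def normalized_profile(model_scores, cohort):
    """Average of min-max normalized per-dimension scores across a
    cohort of models (a sketch; the paper's exact scheme may differ)."""
    values = []
    for dim, score in model_scores.items():
        column = [m[dim] for m in cohort]
        lo, hi = min(column), max(column)
        # A model at the cohort maximum in a dimension scores 1.0 there
        values.append((score - lo) / (hi - lo) if hi > lo else 1.0)
    return sum(values) / len(values)

# Hypothetical per-dimension scores, for illustration only
cohort = [
    {"accuracy": 0.90, "coverage": 0.85, "consistency": 0.95},  # tuned model
    {"accuracy": 0.70, "coverage": 0.60, "consistency": 0.80},  # baseline A
    {"accuracy": 0.60, "coverage": 0.70, "consistency": 0.75},  # baseline B
]
print(round(normalized_profile(cohort[0], cohort), 3))  # → 1.0
```

Averaging normalized dimensions rewards balanced models: a model that leads on accuracy but trails badly on coverage would score below one that is near the top of every dimension.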
Key words
Large language model (LLM) / Traditional Chinese medicine (TCM) / Fine-tuning / P-Tuning v2 / Clinical decision support
Citation: 通旭明, 王雅鹏, 刘利岩, 袁艳红, 丁晓征, 贾慧茹, 杨旭, 严肇基, 王涵, 熊璋. QingNangTCM: a parameter-efficient fine-tuning large language model for traditional Chinese medicine [J]. Digital Chinese Medicine, 2026, 9(1): 1-12.

Funding
Hebei Province Higher Education Scientific Research Project (QN2025367), Zhangjiakou City 2022 Municipal Science and Technology Plan Self-raised Fund Project (221105D), and Hebei Province Education Science "14th Five-Year Plan" Project (2404224).