

口腔疾病防治 (Journal of Prevention and Treatment for Stomatological Diseases), 2025, Vol. 33, Issue 4: 313-319,7. DOI: 10.12016/j.issn.2096-1456.202440370

Accuracy of large language models for answering pediatric preventive dentistry questions

管伯颜¹, 许明鹤¹, 张惠淇¹, 马舒蕾¹, 张珊珊², 赵俊峰³

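Author information

1. Peking University School of Stomatology, Beijing 100081
2. Department of Preventive Dentistry, Peking University Hospital of Stomatology, Beijing 100081
3. School of Computer Science, Peking University, Beijing 100871; Key Laboratory of High Confidence Software Technologies, Ministry of Education, Beijing 100871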

Abstract

Objective To evaluate and compare the accuracy of responses to pediatric preventive dentistry questions from the domestic large language model ChatGLM-6B and the international large language model ChatGPT-3.5, in order to provide insights for further research and development of domestic language models in the field of oral medicine.

Methods A total of 100 common pediatric preventive dentistry questions of varying difficulty levels [basic (n=35), intermediate (n=35), and advanced (n=30)] were provided by pediatric preventive dentistry experts. Two doctors independently submitted these questions to ChatGPT-3.5 and ChatGLM-6B and collected the answers. A cohort of 16 dentists assessed the responses generated by ChatGLM-6B and ChatGPT-3.5 using a predefined 3-point Likert scale. The mean of the 16 ratings was taken as the answer score. An answer was classified as accurate if its score was higher than 2.8, as inaccurate if its score was lower than 1.4, and as partially accurate if its score was between 1.4 and 2.8. The accuracy rates and evaluation outcomes of the two groups were compared, and the consistency of the ratings was analyzed.

Results The answer accuracy rates of ChatGPT-3.5 and ChatGLM-6B for the 100 pediatric preventive dentistry questions were comparable: ChatGPT-3.5 gave 68% accurate, 30% partially accurate, and 2% inaccurate responses, while ChatGLM-6B gave 67% accurate, 31% partially accurate, and 2% inaccurate responses, with no statistically significant difference (P > 0.05). Both models showed equivalent accuracy across questions of varying difficulty levels (basic, intermediate, advanced), with no statistically significant differences (P > 0.05). The overall average scores of ChatGPT-3.5 and ChatGLM-6B across all questions were both 2.65, with no statistically significant difference (P > 0.05). For basic questions, ChatGPT-3.5 had an average score of 2.66 and ChatGLM-6B had an average score of 2.70. For intermediate questions, ChatGPT-3.5 had an average score of 2.63 and ChatGLM-6B had an average score of 2.64. For advanced questions, ChatGPT-3.5 had an average score of 2.68 and ChatGLM-6B had an average score of 2.61. No statistically significant differences were observed in any difficulty category (P > 0.05). The consistency of the experts' grading ranged from fair to moderate.

Conclusion This study demonstrates the potential of both ChatGLM-6B and ChatGPT-3.5 in answering pediatric preventive dentistry questions. ChatGLM-6B performed similarly to ChatGPT-3.5 in this field, but the accuracy rates of both models fell short of expectations, and neither model is suitable for clinical use. Future efforts should focus on improving the accuracy and consistency of large language models in providing medical information, as well as on developing specialized medical models for the field of oral medicine.
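The scoring rule described in the Methods (mean of the raters' scores, then thresholds at 2.8 and 1.4) can be sketched roughly as follows. This is an illustrative sketch, not the authors' code: the function name `classify_answer`, the assumption that the 3-point scale is coded 1 to 3, the treatment of scores exactly equal to 1.4 or 2.8 as partially accurate, and the example ratings are all assumptions made here.

```python
from statistics import mean

ACCURATE_ABOVE = 2.8    # threshold stated in the abstract
INACCURATE_BELOW = 1.4  # threshold stated in the abstract


def classify_answer(ratings):
    """Average the raters' Likert ratings and map the mean to a category."""
    if not ratings:
        raise ValueError("at least one rating is required")
    # Assumption: the 3-point Likert scale is coded as 1, 2, 3.
    if any(r not in (1, 2, 3) for r in ratings):
        raise ValueError("ratings must be 1, 2, or 3")
    score = mean(ratings)
    if score > ACCURATE_ABOVE:
        return score, "accurate"
    if score < INACCURATE_BELOW:
        return score, "inaccurate"
    # Assumption: scores of exactly 1.4 or 2.8 count as partially accurate.
    return score, "partially accurate"


# Hypothetical ratings from 16 raters for one answer (not data from the study).
example_ratings = [3] * 14 + [2] * 2
score, label = classify_answer(example_ratings)
print(f"{score:.2f} -> {label}")
```

Under this rule, an answer needs nearly unanimous top ratings to count as accurate: with 16 raters, at most three ratings of 2 (and no rating of 1) keep the mean above 2.8.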

Key words

large language model/pediatric stomatology/preventive dentistry/stomatology/ChatGPT/artificial intelligence/chatbot/medicine

Classification

Medicine and health

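Cite this article

管伯颜, 许明鹤, 张惠淇, 马舒蕾, 张珊珊, 赵俊峰. Accuracy of large language models for answering pediatric preventive dentistry questions [J]. 口腔疾病防治 (Journal of Prevention and Treatment for Stomatological Diseases), 2025, 33(4): 313-319,7.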

Funding

This study was supported by grants from:

Capital's Funds for Health Improvement and Research (No. 2024-2G-4106)

2023 Innovative Development Research Project of People's Medical Publishing House (No. RWCY23DⅡ007007)

Education Research Project of Peking University School and Hospital of Stomatology in 2024 (No. 2024-ZC-07)

口腔疾病防治 (Journal of Prevention and Treatment for Stomatological Diseases)

ISSN: 1006-5245
