China Medical Education Technology (中国医学教育技术), 2025, Vol. 39, Issue 6: 760-766, 7. DOI: 10.13566/j.cnki.cmet.cn61-1317/g4.202506012
DeepSeek在中国临床执业医师资格模拟考试中的应用研究
Research on the application of DeepSeek in the simulated China National Medical Licensing Examination
Abstract
Objective To assess the performance of DeepSeek-R1, a deep generative pre-trained model, on a simulated China National Medical Licensing Examination and to explore its strengths and limitations. Methods An examination paper of 300 simulated questions covering various formats and disciplinary knowledge was input into DeepSeek-R1, and its responses were recorded. Incorrect answers were challenged to evaluate the model's ability to correct its errors. Results DeepSeek-R1 achieved an overall accuracy of 94.3% (283/300); after challenges, accuracy rose to 97.3% (292/300). Accuracy varied significantly across question types (P<0.05), with A2-type questions scoring the lowest (84.3%). Accuracy on clinical and non-clinical questions was 93.1% and 97.6%, respectively, with no significant difference. Performance was slightly better on lower-order questions (97.1%) than on higher-order questions (92.8%), without statistical significance. The model showed high confidence (100%) in all responses, including incorrect ones. Conclusion DeepSeek-R1 demonstrated strong medical knowledge and reasoning skills, indicating potential for use in medical education and clinical support. However, its weaker performance on complex cases and its unjustified confidence in errors highlight the risk of AI hallucination. Human oversight remains essential for safe and effective application.
Keywords
DeepSeek / continuing education / China National Medical Licensing Examination / AI hallucination / natural language processing
Classification
Education
Cite this article
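李蒙, 金岑, 金琦, 单连弋, 温馨, 李健维, 张雪, 张丽. DeepSeek在中国临床执业医师资格模拟考试中的应用研究 [Research on the application of DeepSeek in the simulated China National Medical Licensing Examination][J]. 中国医学教育技术 (China Medical Education Technology), 2025, 39(6): 760-766, 7.
Funding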
Supported by the National High Level Hospital Clinical Research Funding ()