计算机应用研究 (Application Research of Computers), 2025, Vol. 42, Issue 10: 2947-2955. DOI: 10.19734/j.issn.1001-3695.2025.04.0082
Coordination of fine-tuned code generation models and large language model via uncertainty estimation
Abstract
The complementary mechanism between fine-tuned code generation models and large language models (LLMs) remains underexplored, leading to ambiguous decision boundaries in their collaboration. This paper proposed a method named Coral to coordinate fine-tuned models and LLMs based on uncertainty estimation. The method analyzed the complementarity between the two kinds of models and quantified their decision boundary. Coral adopted the concept of expected calibration error to compare uncertainty estimation methods and selected a stable one for the fine-tuned model, enabling it to output uncertainty scores that reflect prediction confidence. Coral then calculated an uncertainty threshold by maximizing BLEU scores on a validation dataset, which quantified the decision boundary between the two models. Based on this threshold and the uncertainty scores, the method identified in-distribution (ID) and out-of-distribution (OOD) data, and the LLM handled the OOD data to improve the generalization of the fine-tuned model. Evaluation on two benchmark datasets shows that Coral outperforms the use of either model alone in both BLEU and Exact Match metrics. The results indicate that Coral effectively coordinates the fine-tuned model and the LLM.

Key words
large language model / fine-tuned model / code generation / uncertainty estimation

Classification
Information Technology and Security Science
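The two mechanisms summarized in the abstract — comparing uncertainty estimators by expected calibration error (ECE), and picking a routing threshold that maximizes validation-set BLEU — can be sketched as follows. This is an illustrative sketch under assumed inputs (per-sample uncertainty scores and precomputed BLEU for each model), not the paper's implementation; all names and data are hypothetical.

```python
from typing import List, Tuple

def expected_calibration_error(confidences: List[float],
                               correct: List[bool],
                               n_bins: int = 10) -> float:
    """ECE: bin samples by confidence, then average |accuracy - confidence|
    over the bins, weighted by bin size. Lower means better calibrated."""
    bins = [[] for _ in range(n_bins)]
    for c, ok in zip(confidences, correct):
        bins[min(int(c * n_bins), n_bins - 1)].append((c, ok))
    n = len(confidences)
    ece = 0.0
    for b in bins:
        if b:
            avg_conf = sum(c for c, _ in b) / len(b)
            acc = sum(1 for _, ok in b if ok) / len(b)
            ece += (len(b) / n) * abs(avg_conf - acc)
    return ece

def choose_threshold(uncertainty: List[float],
                     bleu_finetuned: List[float],
                     bleu_llm: List[float],
                     candidates: List[float]) -> Tuple[float, float]:
    """Pick the threshold t maximizing mean validation BLEU when samples with
    uncertainty > t (treated as OOD) are routed to the LLM and the rest (ID)
    stay with the fine-tuned model."""
    best_t, best_bleu = candidates[0], float("-inf")
    for t in candidates:
        mean_bleu = sum(b_llm if u > t else b_ft
                        for u, b_ft, b_llm
                        in zip(uncertainty, bleu_finetuned, bleu_llm)) / len(uncertainty)
        if mean_bleu > best_bleu:
            best_t, best_bleu = t, mean_bleu
    return best_t, best_bleu

# Toy validation data: the fine-tuned model is strong on the low-uncertainty
# (ID-like) samples, while the LLM is stronger on the high-uncertainty one.
u   = [0.1, 0.4, 0.8]
ft  = [0.9, 0.7, 0.2]
llm = [0.5, 0.5, 0.6]
t, score = choose_threshold(u, ft, llm, [0.0, 0.5, 1.0])
print(t)  # 0.5 — only the u=0.8 sample is routed to the LLM
```

At threshold 0.5 the mean validation BLEU is (0.9 + 0.7 + 0.6) / 3 ≈ 0.733, better than routing everything to either model alone, which mirrors the complementarity the paper exploits.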
Cite this article
Hong Shaodong, Shen Guowei, Luo Sufen, Liu Tao. Coordination of fine-tuned code generation models and large language model via uncertainty estimation [J]. 计算机应用研究 (Application Research of Computers), 2025, 42(10): 2947-2955.

Funding
National Natural Science Foundation of China (62062022)
Guizhou Provincial Science and Technology Plan Project (黔科合基础-ZK[2023]重点011)
Guizhou Provincial Science and Technology Support Plan Project (黔科合支撑[2023]一般447)