现代信息科技 (Modern Information Technology), 2025, Vol. 9, Issue 11: 59-63, 69, 6. DOI: 10.19850/j.cnki.2096-4706.2025.11.012
关于大语言模型一体化评测的研究和实践
Research and Practice on Integrated Evaluation of Large Language Models
何奇 1, 韩啸 1, 茆昊天 1, 仇建民 1
Author information
- 1. China Telecom Corporation Limited, Jiangsu Branch, Nanjing, Jiangsu 210037
Abstract
With the growing adoption of LLMs, how to evaluate the capabilities of large models accurately, objectively, and comprehensively has become an important topic of shared concern in academia and industry. In recent years, Jiangsu Telecom has actively explored and applied LLMs, rebuilding multiple applications in the BMO domains with large models. This paper introduces Jiangsu Telecom's integrated evaluation scheme and its system implementation, built on the current open-source large-model ecosystem. The scheme can quickly onboard newly released open-source large models and supports blind-test model selection based on practical applications, providing a useful reference for building a more rigorous and complete Large Language Model evaluation system.
Keywords
LLMs / evaluation / framework
Classification
Information technology and security science
Cite this article
何奇, 韩啸, 茆昊天, 仇建民. 关于大语言模型一体化评测的研究和实践[J]. 现代信息科技, 2025, 9(11): 59-63, 69, 6.