Optimization and application of vision-based large models in educational scenarios

许跃蓬 徐柴迪 郭晋军 姜云桥 王仕嘉 刘垚

大数据 (Big Data Research), 2025, Vol. 11, Issue 5: 67-85, 19. DOI: 10.11959/j.issn.2096-0271.2025056

Optimization and application of vision-based large models in educational scenarios

许跃蓬 1, 徐柴迪 2, 郭晋军 3, 姜云桥 4, 王仕嘉 3, 刘垚 5

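Author information

  • 1. School of Data Science and Engineering, East China Normal University, Shanghai 200062; National Key Laboratory for Advanced Computing and Intelligence Engineering, Wuxi, Jiangsu 214000
  • 2. 上海孔棣科技有限公司 (Shanghai Kongdi Technology Co., Ltd.), Shanghai 201204
  • 3. School of Data Science and Engineering, East China Normal University, Shanghai 200062
  • 4. Moore Threads Intelligent Technology (Chengdu) Co., Ltd., Chengdu, Sichuan 610031
  • 5. School of Data Science and Engineering, East China Normal University, Shanghai 200062; National Key Laboratory for Advanced Computing and Intelligence Engineering, Wuxi, Jiangsu 214000; Engineering Research Center of Software/Hardware Co-design Technology and Application, Ministry of Education (East China Normal University), Shanghai 200062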

Abstract

With the rapid advancement of artificial intelligence, large language models (LLMs) have achieved significant success across many fields. However, their application in education still faces challenges, including difficulty processing multimodal data, insufficient response accuracy, and limited ways of delivering information. To address these issues, VELM, a vision-based large model for educational scenarios, is proposed. VELM is trained on public multimodal educational datasets and specialized educational datasets. Combined with model optimization techniques, it improves response quality in educational scenarios while reducing computational resource consumption. In addition, retrieval-augmented generation (RAG) is used to ensure the accuracy and richness of the generated content. For deployment, VELM is served through the Dify platform, which supports flexible multi-end deployment, including WeChat mini programs, Web cloud platforms, and local deployment, to meet the needs of different educational scenarios. Evaluation experiments show that VELM significantly outperforms open-source large models such as MiniCPM-V, DeepSeek-VL, and Yi-VL on standard benchmarks such as MathVista, OCRBench, and MMMU. On specialized educational evaluation datasets, VELM improves accuracy by 9.78% over its base model, Qwen2-VL.

Key words

large language model / multimodal / smart education / RAG technology

Classification

Information technology and security science

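Cite this article

许跃蓬, 徐柴迪, 郭晋军, 姜云桥, 王仕嘉, 刘垚. Optimization and application of vision-based large models in educational scenarios[J]. 大数据 (Big Data Research), 2025, 11(5): 67-85, 19.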

Funding

The National Natural Science Foundation of China (No. 42375146)

The National Key Scientific and Technological Infrastructure Project (No. 2024-EL-PT-000737)

The Fund of Laboratory for Advanced Computing and Intelligence Engineering (No. 2023-LYJJ-01-006)

GHfund A (No. 202407013820)

大数据 (Big Data Research)

ISSN 2096-0271
