PMoE: a parameter-efficient fine-tuning framework introducing mixture of experts in P-tuning

Wang Hao (王浩), Wang Jun (王珺), Hu Haifeng (胡海峰), Zhou Feifei (周飞飞), Gong Rui (龚锐), Zhang Suofei (张索非)

Application Research of Computers (计算机应用研究), 2025, Vol. 42, Issue 7: 1956-1963 (8 pages). DOI: 10.19734/j.issn.1001-3695.2024.11.0484

PMoE: parameter-efficient fine-tuning framework introducing mixture of experts in P-tuning

Wang Hao (王浩) [1], Wang Jun (王珺) [1], Hu Haifeng (胡海峰) [1], Zhou Feifei (周飞飞) [2], Gong Rui (龚锐) [2], Zhang Suofei (张索非) [3]

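Author information

  • 1. School of Communications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003
  • 2. China Telecom Corporation Limited, Jiangsu Branch, Nanjing 210037
  • 3. School of Internet of Things, Nanjing University of Posts and Telecommunications, Nanjing 210003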

Abstract

Large language models (LLMs) have significantly improved performance on reasoning and generation tasks. However, existing open-source LLMs still lack sufficient domain-specific knowledge and require fine-tuning for specialized tasks. Traditional fine-tuning methods struggle to balance low cost and high efficiency in multi-task learning. To address this issue, this paper proposes a parameter-efficient fine-tuning framework named PMoE. Building on the P-tuning method, the framework introduces a mixture-of-experts mechanism to enhance multi-task processing while keeping tuning costs low. In each Transformer layer, PMoE constructs trainable expert modules to replace the prompt modules of P-tuning and uses a routing mechanism to dynamically allocate tasks based on input task features. Additionally, the expert modules in PMoE are designed to be detachable, enabling model reuse across different task scenarios and further reducing computational costs. Experimental results demonstrate that PMoE achieves a 6.24% performance improvement over P-tuning on a Chinese medical dataset and exhibits superior capabilities in multi-task processing and transfer learning, verifying its efficiency and broad applicability.
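
The abstract does not give the exact form of the expert modules or the router, so the following is only a minimal sketch of the general idea, under several assumptions: each expert is a small bank of trainable prompt vectors, the "task feature" is the mean of the input token vectors, and the router is a linear scorer followed by a softmax that blends all experts. The class name `PromptExpertLayer` and all of its parameters are hypothetical and do not come from the paper.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class PromptExpertLayer:
    """One layer's prompt experts plus a softmax router (illustrative only)."""

    def __init__(self, num_experts, prompt_len, dim, seed=0):
        rng = random.Random(seed)
        # experts[e][p][d]: prompt vectors owned by expert e (trainable in practice)
        self.experts = [
            [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(prompt_len)]
            for _ in range(num_experts)
        ]
        # router[e][d]: one linear scoring vector per expert (assumed form)
        self.router = [
            [rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(num_experts)
        ]

    def gates(self, task_feature):
        """Routing weights over experts for a given task feature vector."""
        scores = [sum(w * x for w, x in zip(row, task_feature)) for row in self.router]
        return softmax(scores)

    def mixed_prompt(self, task_feature):
        """Gate-weighted sum of the expert prompts, shape [prompt_len][dim]."""
        g = self.gates(task_feature)
        prompt_len = len(self.experts[0])
        dim = len(self.experts[0][0])
        return [
            [sum(g[e] * self.experts[e][p][d] for e in range(len(g))) for d in range(dim)]
            for p in range(prompt_len)
        ]

    def forward(self, hidden_states):
        """Prepend the routed prompt to the token vectors of this layer."""
        n = len(hidden_states)
        dim = len(hidden_states[0])
        # Assumption: the task feature is the mean of the token vectors.
        feature = [sum(tok[d] for tok in hidden_states) / n for d in range(dim)]
        return self.mixed_prompt(feature) + hidden_states


layer = PromptExpertLayer(num_experts=3, prompt_len=2, dim=4)
hidden = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]
out = layer.forward(hidden)  # 2 prompt vectors followed by the 2 input tokens
```

Because each expert in this sketch is a separate entry in `experts`, one could in principle drop or swap an entry without touching the frozen backbone, which is one plausible reading of the "detachable" property; the paper's actual mechanism may differ.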

Keywords

large language model; parameter-efficient fine-tuning; P-tuning; mixture of experts; multi-task learning

Classification

Information technology and security science

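Cite this article

Wang Hao, Wang Jun, Hu Haifeng, Zhou Feifei, Gong Rui, Zhang Suofei. PMoE: a parameter-efficient fine-tuning framework introducing mixture of experts in P-tuning [J]. Application Research of Computers, 2025, 42(7): 1956-1963.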

Funding

National Natural Science Foundation of China (62371245)

Application Research of Computers (计算机应用研究)

Open access; Peking University Core Journal

ISSN 1001-3695
