Computer Engineering and Applications, 2025, Vol. 61, Issue (22): 20-35, 16. DOI: 10.3778/j.issn.1002-8331.2502-0161
Algorithm Optimization Method for Mixture of Experts Under Computational Power Constraints: Status and Progress
Abstract
Large language models have achieved remarkable results in natural language processing and other fields in recent years. The mixture-of-experts (MoE) architecture reduces computational demands by employing sparse activation strategies. As the inference tasks faced by MoE models grow increasingly complex, expert models deployed on edge devices often encounter resource requirements that exceed node computing power. Consequently, computational optimization of MoE under computational constraints has become a persistent research focus in the field. This paper introduces the concept and architecture of MoE, and categorizes and reviews relevant optimization methods across three dimensions: gating networks, expert structures and models, and memory management. At the gating-network level, routing design, loss-function optimization, and load-balancing mechanisms are reviewed as means of achieving precise routing. At the expert-structure level, structural innovations are summarized, including various expert designs, preprocessing methods, and expert-merging strategies. At the memory-management level, existing parameter-compression and memory-offloading techniques are reviewed to address resource constraints during model deployment. This paper analyzes the principles, strategies, and key technical challenges of computational optimization across these dimensions, and then identifies critical issues and potential research opportunities that merit further attention.
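To make the sparse-activation and load-balancing ideas above concrete, the following is a minimal sketch of a top-k gating network in Python/PyTorch. It is illustrative only, not the paper's own method: the module name TopKGate, the dimension parameters, and the auxiliary loss (which follows the widely used Switch Transformer load-balancing formulation) are assumptions made for this sketch.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TopKGate(nn.Module):
        # Hypothetical minimal gate: routes each token to k of num_experts experts.
        def __init__(self, d_model, num_experts, k=2):
            super().__init__()
            self.k = k
            self.num_experts = num_experts
            self.router = nn.Linear(d_model, num_experts, bias=False)

        def forward(self, x):
            # x: (num_tokens, d_model); routing probabilities: (num_tokens, num_experts)
            probs = F.softmax(self.router(x), dim=-1)
            # Sparse activation: keep only the k largest gate values per token.
            topk_vals, topk_idx = probs.topk(self.k, dim=-1)
            # Auxiliary load-balancing loss (Switch-Transformer style):
            # fraction of tokens hard-assigned to each expert ...
            dispatch = F.one_hot(topk_idx, self.num_experts).float().sum(dim=1)
            frac_tokens = dispatch.mean(dim=0) / self.k
            # ... multiplied by the mean router probability per expert;
            # the product is minimized when routing is uniform across experts.
            frac_probs = probs.mean(dim=0)
            aux_loss = self.num_experts * (frac_tokens * frac_probs).sum()
            return topk_vals, topk_idx, aux_loss

At inference time only the k selected experts per token are executed, which is what lets an MoE layer grow its total parameter count without a proportional growth in per-token FLOPs; during training, aux_loss is added to the task objective so that expert utilization stays balanced.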
Keywords
mixture of experts (MoE) / computational optimization / load balancing / expert structure / memory management
Classification
Computer and Automation
WEN Jialin, LI Xiaojun, YAO Junping, GU Hongyang. Algorithm optimization method for mixture of experts under computational power constraints: status and progress[J]. Computer Engineering and Applications, 2025, 61(22): 20-35, 16.
Funding
National Natural Science Foundation of China (62401609)
Natural Science Basic Research Program of Shaanxi Province (2024JC-YBQN-0628)
China Postdoctoral Science Foundation (2024M754275)