Computer Engineering and Applications, 2026, Vol. 62, Issue (7): 53-69, 17. DOI: 10.3778/j.issn.1002-8331.2505-0035
Survey of Research on On-Device Large Language Models
Abstract
The development of large language model (LLM) technology is pushing the boundaries of natural language processing and artificial intelligence. However, cloud-based deployment of LLMs faces significant challenges, including latency, privacy risks, network dependency, and high resource consumption. To address these issues, on-device LLMs have gradually become a research hotspot, aiming at localized deployment on resource-constrained devices. This shift enhances real-time responsiveness, strengthens data privacy, and improves energy efficiency. The article surveys the current state of research on on-device large language models along two dimensions: model lightweighting and system-level support. Firstly, it surveys current mainstream model compression methods, including quantization, pruning, knowledge distillation, and low-rank decomposition. Secondly, it discusses model architecture innovations tailored for on-device scenarios, including mixture-of-experts approaches and hierarchical designs. Additionally, it covers key system-level technologies, such as lightweight inference engines, specialized AI chips for edge devices, and memory optimization strategies. Finally, the paper highlights the major challenges and outlines potential future directions for advancing on-device large language models.
Keywords
on-device large language models / model compression / lightweight framework / hardware acceleration
Classification
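Information Technology and Security Science
Cite this article
赵家富, 柳林, 王海龙, 牛天元, 刘静. Survey of Research on On-Device Large Language Models[J]. Computer Engineering and Applications, 2026, 62(7): 53-69, 17.
Funding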
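National Natural Science Foundation of China (62566047)
Natural Science Foundation of Inner Mongolia Autonomous Region (2023LHMS06006, 2024LHMS06015)
Development Project of an Intelligent Carbon Emission Management System Based on Machine Learning (20240043C)
Construction Project of an Artificial Intelligence + Education Resource Optimization and Layout Platform (3203002507)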