A Survey of Research on On-Device Large Language Models

Computer Engineering and Applications, 2026, Vol. 62, Issue (7): 53-69, 17. DOI: 10.3778/j.issn.1002-8331.2505-0035

A Survey of Research on On-Device Large Language Models

Zhao Jiafu ¹, Liu Lin ¹, Wang Hailong ¹, Niu Tianyuan ¹, Liu Jing ²

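Author information

1. College of Computer Science and Technology, Inner Mongolia Normal University, Hohhot 010022; Joint Innovation Laboratory of Computing Science, Inner Mongolia Normal University, Hohhot 010022
2. Library of Inner Mongolia University, Hohhot 010021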
Abstract

The development of large language model (LLM) technology is pushing the boundaries of natural language processing and artificial intelligence. However, cloud-based deployment of LLMs faces significant challenges, including latency, privacy risks, network dependency, and high resource consumption. To address these issues, on-device LLMs have become a research hotspot, aiming at localized deployment on resource-constrained devices. This shift enhances real-time responsiveness, strengthens data privacy, and improves energy efficiency. This article surveys the current state of research on on-device LLMs along two dimensions: model lightweighting and system-level support. First, it reviews mainstream model compression methods, including quantization, pruning, knowledge distillation, and low-rank decomposition. Second, it discusses model architecture innovations tailored to on-device scenarios, including mixture-of-experts approaches and hierarchical designs. It then covers key system-level technologies, such as lightweight inference engines, specialized AI chips for edge devices, and memory optimization strategies. Finally, it highlights the major challenges and outlines potential future directions for on-device LLMs.

Keywords

on-device large language models / model compression / lightweight framework / hardware acceleration

Classification

Information Technology and Security Science

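Cite this article

Zhao Jiafu, Liu Lin, Wang Hailong, Niu Tianyuan, Liu Jing. A Survey of Research on On-Device Large Language Models[J]. Computer Engineering and Applications, 2026, 62(7): 53-69, 17.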

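Funding

National Natural Science Foundation of China (62566047)

Natural Science Foundation of Inner Mongolia Autonomous Region (2023LHMS06006, 2024LHMS06015)

Development Project of an Intelligent Carbon Emission Management System Based on Machine Learning (20240043C)

Construction Project of an AI + Education Resource Optimization and Layout Platform (3203002507)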

Computer Engineering and Applications

ISSN: 1002-8331
