
Survey of Research on On-Device Large Language Models

赵家富 柳林 王海龙 牛天元 刘静

Computer Engineering and Applications, 2026, Vol. 62, Issue 7: 53-69, 17. DOI: 10.3778/j.issn.1002-8331.2505-0035

Survey of Research on On-Device Large Language Models

赵家富¹, 柳林¹, 王海龙¹, 牛天元¹, 刘静²

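Author information

  • 1. College of Computer Science and Technology, Inner Mongolia Normal University, Hohhot 010022; Joint Innovation Laboratory of Computational Science, Inner Mongolia Normal University, Hohhot 010022
  • 2. Library, Inner Mongolia University, Hohhot 010021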

Abstract

The development of large language model (LLM) technology is pushing the boundaries of natural language processing and artificial intelligence. However, cloud-based deployment of LLMs faces significant challenges, including latency, privacy risks, network dependency, and high resource consumption. To address these issues, on-device LLMs have gradually become a research hotspot, aiming at localized deployment on resource-constrained devices. This shift enhances real-time responsiveness, strengthens data privacy, and improves energy efficiency. This article surveys the current state of research on on-device large language models along two dimensions: model lightweighting and system-level support. First, it reviews current mainstream model compression methods, including quantization, pruning, knowledge distillation, and low-rank decomposition. Second, it discusses model architecture innovations tailored for on-device scenarios, including mixture-of-experts approaches and hierarchical designs. It also covers key system-level technologies, such as lightweight inference engines, specialized AI chips for edge devices, and memory optimization strategies. Finally, the paper highlights the major challenges and outlines potential future directions for advancing on-device large language models.

Key words

on-device large language models / model compression / lightweight framework / hardware acceleration

Classification

Information Technology and Security Science

Cite this article

赵家富, 柳林, 王海龙, 牛天元, 刘静. Survey of Research on On-Device Large Language Models[J]. Computer Engineering and Applications, 2026, 62(7): 53-69, 17.

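Funding

National Natural Science Foundation of China (62566047)

Natural Science Foundation of Inner Mongolia Autonomous Region (2023LHMS06006, 2024LHMS06015)

Development Project of a Machine Learning-Based Intelligent Carbon Emission Management System (20240043C)

AI + Education Resource Optimization and Layout Platform Construction Project (3203002507)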

Computer Engineering and Applications

ISSN 1002-8331
