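A Method for Constructing Domain-Specific Large Language Models Oriented to Manufacturing Process Design (面向工艺设计的领域大模型构建方法)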

Journal of Sichuan University (Natural Science Edition), 2025, Vol. 62, Issue 3: 513-521, 9. DOI: 10.19907/j.0490-6756.250087

刘祥根¹, 郭彦¹, 李玥², 史建成³, 刘文³, 邓洪波³, 孙晨伟¹, 李阳³, 吕建成¹

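Author information

  • 1. College of Computer Science, Sichuan University (四川大学计算机学院), Chengdu 610065
  • 2. Dongfang Electric (Chengdu) Innovation Research Co., Ltd. (东方电气(成都)创新研究有限公司), Chengdu 610213
  • 3. Southwest China Research Institute of Electronic Equipment (西南电子设备研究所), Chengdu 610036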

Abstract

With the increasing demand for intelligent support in manufacturing process design, approaches for constructing domain-specific large language models (LLMs) have emerged as a key research focus. Although LLMs have achieved remarkable success in a wide range of natural language processing tasks, their direct applicability to the manufacturing process design domain remains limited. This is primarily due to the scarcity of training samples, the complexity of data formats, and the lack of structured annotations needed for supervised learning. Moreover, existing attention mechanisms face challenges such as high computational complexity, high resource consumption, and unstable global semantics when processing long and complex texts, further limiting the adaptability of LLMs to downstream tasks in industry. To address these challenges, this study proposes a domain-specific LLM construction method tailored to manufacturing process design and presents Luban-10B, a 10-billion-parameter transformer-based language model. This method introduces hybrid sparse attention by preserving attention weights on initial identifiers and dynamically selecting the top-K most relevant historical tokens based on the query. By avoiding dense attention over the entire sequence, it significantly reduces computational complexity while enhancing the model's ability to capture key information in long texts. Experimental results show that Luban-10B effectively enhances the adaptability and generative performance of domain-specific LLMs in manufacturing process design, offering a new technological path and support for intelligent design of the manufacturing process.
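
The hybrid sparse attention idea described in the abstract (always keep the initial tokens, then add the top-K historical tokens most relevant to the current query) can be illustrated with a minimal single-query sketch. This is not the Luban-10B implementation: the abstract does not give the scoring function, the number of initial tokens kept, or how selection is made efficient, so the scaled dot-product score, the function name `hybrid_sparse_attention`, and the parameters `num_initial` and `top_k` are all assumptions made for illustration. The sketch also scores every key before selecting, so it shows the attention pattern rather than the claimed cost savings.

```python
import math


def hybrid_sparse_attention(query, keys, values, num_initial=2, top_k=4):
    """Single-query sparse attention (illustrative sketch, not the paper's code).

    Attends only to the first `num_initial` positions plus the `top_k`
    highest-scoring remaining positions. Assumes `keys` is non-empty and
    that `keys` and `values` have the same length.
    """
    d = len(query)
    scale = 1.0 / math.sqrt(d)
    # Assumed relevance score: scaled dot product between query and each key.
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]

    n = len(keys)
    # Always keep the initial positions.
    initial = list(range(min(num_initial, n)))
    # Among the remaining history, keep the top_k positions by score.
    rest = range(len(initial), n)
    top = sorted(rest, key=lambda i: scores[i], reverse=True)[:top_k]
    selected = sorted(initial + top)

    # Numerically stable softmax restricted to the selected positions.
    m = max(scores[i] for i in selected)
    exps = [math.exp(scores[i] - m) for i in selected]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the selected value vectors.
    dim_v = len(values[0])
    output = [
        sum(w * values[i][j] for w, i in zip(weights, selected))
        for j in range(dim_v)
    ]
    return output, selected, weights
```

With seven history positions, `num_initial=2` and `top_k=2`, the output mixes only four value vectors instead of seven; positions outside the selected set contribute nothing to the result.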

Keywords

Large language model / Attention mechanism / Long-text generation / Process design

Classification

Information Technology and Security Science

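Cite this article

刘祥根, 郭彦, 李玥, 史建成, 刘文, 邓洪波, 孙晨伟, 李阳, 吕建成. A method for constructing domain-specific large language models oriented to manufacturing process design [J]. Journal of Sichuan University (Natural Science Edition), 2025, 62(3): 513-521, 9.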

Funding

National Key Research and Development Program of China (2024YFB3312503)

Natural Science Foundation of Sichuan Province (2024NSFTD0048)

Sichuan Province Major Special Project (2024ZDZX0003)

Journal of Sichuan University (Natural Science Edition)

Open access (OA); Peking University Core Journal (北大核心)

ISSN 0490-6756
