Journal of Sichuan University (Natural Science Edition), 2025, Vol. 62, Issue 3: 513-521 (9 pages). DOI: 10.19907/j.0490-6756.250087
A method for constructing domain-specific large language models oriented to manufacturing process design
Abstract
With the increasing demand for intelligent support in manufacturing process design, approaches for constructing domain-specific large language models (LLMs) have emerged as a key research focus. Although LLMs have achieved remarkable success across a wide range of natural language processing tasks, their direct applicability to manufacturing process design remains limited. This is primarily due to the scarcity of training samples, the complexity of data formats, and the lack of the structured annotations needed for supervised learning. Moreover, existing attention mechanisms suffer from high computational complexity, high resource consumption, and unstable global semantics when processing long and complex texts, which further limits the adaptability of LLMs to downstream industrial tasks. To address these challenges, this study proposes a domain-specific LLM construction method tailored to manufacturing process design and presents Luban-10B, a 10-billion-parameter transformer-based language model. The method introduces hybrid sparse attention, which preserves the attention weights on the initial tokens and dynamically selects the top-K most relevant historical tokens for each query. By avoiding dense attention over the entire sequence, it significantly reduces computational complexity while enhancing the model's ability to capture key information in long texts. Experimental results show that Luban-10B improves the adaptability and generative performance of domain-specific LLMs in manufacturing process design, offering a new technical path and support for intelligent manufacturing process design.
Keywords
Large language model / Attention mechanism / Long-text generation / Process design
Classification
Information Technology and Security Science
Cite this article
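刘祥根, 郭彦, 李玥, 史建成, 刘文, 邓洪波, 孙晨伟, 李阳, 吕建成. A method for constructing domain-specific large language models oriented to manufacturing process design [J]. Journal of Sichuan University (Natural Science Edition), 2025, 62(3): 513-521 (9 pages).
Funding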
National Key Research and Development Program of China (2024YFB3312503)
Natural Science Foundation of Sichuan Province (2024NSFTD0048)
Major Special Project of Sichuan Province (2024ZDZX0003)
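
Illustrative sketch
The abstract describes hybrid sparse attention as keeping the attention weights on the initial tokens and adding the top-K most relevant historical tokens for each query. The snippet below is a minimal single-query, single-head NumPy sketch of that selection idea, not the authors' implementation. The function name `hybrid_sparse_attention` and the parameters `n_sink` and `top_k` are hypothetical, chosen here for illustration. For simplicity it scores every key before selecting, so it shows the selection logic rather than the computational savings the paper reports.

```python
import numpy as np

def hybrid_sparse_attention(q, K, V, n_sink=4, top_k=8):
    """Attend from one query to the initial tokens plus the top-k scoring earlier tokens.

    q: (d,) query; K: (n, d) keys; V: (n, d_v) values for the n >= 1 tokens
    visible to the query. Returns (output of shape (d_v,), sorted selected indices).
    All names and defaults here are illustrative assumptions.
    """
    n, d = K.shape
    scores = K @ q / np.sqrt(d)               # scaled dot-product scores, shape (n,)

    n_sink = min(n_sink, n)
    sink_idx = np.arange(n_sink)              # initial tokens are always kept

    rest_idx = np.arange(n_sink, n)           # candidates for query-dependent selection
    k = min(top_k, rest_idx.size)
    if k > 0:
        # argpartition yields the positions of the k largest scores (unordered)
        part = np.argpartition(scores[rest_idx], -k)[-k:]
        top_idx = rest_idx[part]
    else:
        top_idx = np.empty(0, dtype=int)

    selected = np.sort(np.concatenate([sink_idx, top_idx]))

    # Softmax over the selected positions only; all other tokens get zero weight.
    s = scores[selected]
    w = np.exp(s - s.max())
    w /= w.sum()
    return w @ V[selected], selected

# Usage: 64 visible tokens, of which 4 initial tokens + 8 selected tokens are attended to.
rng = np.random.default_rng(0)
K = rng.normal(size=(64, 16))
V = rng.normal(size=(64, 16))
q = rng.normal(size=16)
out, idx = hybrid_sparse_attention(q, K, V, n_sink=4, top_k=8)
print(out.shape, idx.size)  # (16,) 12
```

When `n_sink + top_k` covers the whole sequence, the sketch reduces to ordinary dense softmax attention, which gives a simple sanity check for the selection logic.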