Computer Engineering and Applications (计算机工程与应用), 2025, Vol. 61, Issue (24): 29-39, 11. DOI: 10.3778/j.issn.1002-8331.2503-0101
Progress and Challenges in 3D Large Language Model Research
Abstract
Three-dimensional large language models (3D LLMs), as an important cross-modal learning approach, can not only process linguistic data but also integrate and comprehend diverse modalities such as 3D point clouds, images, and videos, advancing scene understanding, reasoning, and generative tasks. With the growing demand for spatial perception and multimodal data processing in intelligent systems, applications of 3D LLMs are becoming increasingly important. This paper examines the characteristics, research directions, challenges, and research objectives of 3D LLMs. It discusses the network framework of 3D LLMs, including the construction of multi-source 3D datasets, data preprocessing and feature extraction, multimodal feature fusion, model pre-training and efficient optimization strategies, and applications to a variety of downstream tasks. It then analyzes evaluation methods for 3D LLMs, covering comprehensive model performance comparison, zero-shot learning, and generalization ability analysis. Finally, the limitations of current 3D LLM research are briefly described, application prospects are envisioned, and directions for future research are proposed.
Keywords
3D large language model (3D LLM) / multimodal learning / point cloud / neural network / feature fusion
Classification
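Information Technology and Security Science
Cite this article
GUO Ming, ZHANG Yaru, ZHU Li, WANG Guoli, HUANG Ming. Progress and Challenges in 3D Large Language Model Research[J]. Computer Engineering and Applications, 2025, 61(24): 29-39, 11.
Funding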
National Key Research and Development Program of China (2022YFF0904301)
National Natural Science Foundation of China (42171416)