微型电脑应用 (Microcomputer Applications), 2026, Vol. 42, Issue 1: 1-3, 3.
面向AI训练的数据存储技术研究
Research on Data Storage Technology for AI Training
Abstract
Data-driven artificial intelligence (AI) is advancing rapidly, with the underlying layer training and optimizing algorithms on large amounts of data. The parameter scale of large AI models has reached billions; models have evolved from single-modal to multimodal, involving types such as text, language, and video; and the data scale has grown from tens of terabytes to petabytes. As model parameters grow, the efficiency requirements for training rise accordingly, and improving storage performance becomes an important direction for enhancing the training efficiency of large models. This paper analyzes the storage access requirements at different stages of large-model training, examines how well different distributed storage systems match large-model training workloads, and explores and proposes directions for improving distributed storage performance.

Key words
artificial intelligence / distributed storage / cache / metadata
Classification
Information Technology and Security Science
Citation
尤丽珏, 陈洁, 袁文彬, 陈裔, 单蓉胜. 面向AI训练的数据存储技术研究 (Research on Data Storage Technology for AI Training) [J]. 微型电脑应用, 2026, 42(1): 1-3, 3.
Funding
Medical Quality (Evidence-Based) Management Research Project of the Institute of Hospital Management, National Health Commission (YLZLXZ23G017)
Shanghai Shenkang Hospital Development Center Clinical Research Data Sharing and Simulated RCT Project, CRU Collaborative Data Quality Improvement Project (SHDC2024CRX028)