网络安全与数据治理 (Cyber Security and Data Governance), 2026, Vol. 45, Issue 4: 1-8 (8 pages). DOI: 10.19358/j.issn.2097-1788.2026.04.001
Data Factory: an emerging form of national data infrastructure
Abstract
Realizing the value of data as a factor of production faces widespread challenges, including insufficient supply, restricted circulation, and ineffective utilization. The core reason lies in the immaturity of data production modes: high-quality datasets still rely on workshop-style production, which cannot meet the large-scale data demands of large Artificial Intelligence (AI) models. To address this problem, the concept of the "Data Factory" is proposed and defined as a data infrastructure dedicated to the facility-based, large-scale, and standardized production of high-quality datasets for large AI model applications. By tracing the evolution of infrastructure forms across industrial society, information society, and data-intelligent society, the theoretical logic of the Data Factory as a fundamental building block of national data infrastructure is established. Based on characteristics such as physical distribution, organizational structure, and technological sophistication, Data Factories are classified into three types: centralized, semi-centralized, and distributed. Five key features are identified: diversity, facility-orientation, scalability, standardization, and AI-integration. The study concludes that the development of Data Factories can effectively break through the data supply bottleneck in AI development, promote upstream and downstream collaboration in the data industry chain, and serve as a critical path to bridging the "last mile" gap between data and AI empowerment.
Keywords
Data Factory / data infrastructure / high-quality dataset / data factorization
Classification
Management science
Cite this article
张茜茜, 殷宏宇, 杨光. Data Factory: an emerging form of national data infrastructure [J]. 网络安全与数据治理 (Cyber Security and Data Governance), 2026, 45(4): 1-8.
Funding
Beijing Social Science Fund (23GLC058)