Big Data Research (大数据), 2026, Vol. 12, Issue (2): 64-74, 11. DOI: 10.11959/j.issn.2096-0271.2026036
高质量数据集产品的形态和生产流程研究
Research on the form and production process of high-quality dataset products
Yang Lin 1, Zhu Yangyong 2
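Author information
- 1. Shanghai Municipal Big Data Center, Shanghai 200003 || School of Data Science and Engineering, East China Normal University, Shanghai 200062
- 2. Shanghai Data Research Institute Co., Ltd., Shanghai 200120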
Abstract
High-quality datasets determine the training performance of artificial intelligence models. The lack of a unified standard form and of a quality-controllable, process-based production method for high-quality datasets has led to insufficient supply and inefficient circulation, which has become a bottleneck restricting the development and application of artificial intelligence. From the perspective of data products, this paper proposes a five-tuple form for high-quality dataset products. Supported by end-to-end technical capabilities, we design a production process for such products and propose a product-oriented quality control method that covers the entire production chain. This work provides a theoretical basis and feasible solutions for the large-scale production and circulation of high-quality dataset products.
Key words
high-quality dataset / data product / form / production process / data quality
Classification
Information technology and security science
Cite this article
Yang Lin, Zhu Yangyong. Research on the form and production process of high-quality dataset products[J]. Big Data Research, 2026, 12(2): 64-74, 11.