A Hybrid Storage Scheme for Large-scale Knowledge Graphs

YOU Yiheng, WANG Xin, MA Menglu, WANG Hui

Computer Engineering, 2025, Vol. 51, Issue 12: 43-55, 13. DOI: 10.19678/j.issn.1000-3428.0252059

YOU Yiheng¹, WANG Xin¹, MA Menglu¹, WANG Hui¹

Author information

  • 1. College of Intelligence and Computing, Tianjin University, Tianjin 300350, China

Abstract

Knowledge graphs, a crucial form of data organization in artificial intelligence, are widely applied across numerous domains as big data and large-scale models continue to develop. As knowledge graphs grow in scale, existing storage structures face challenges such as slow data ingestion and excessive storage space consumption. To address these issues, this paper proposes a hybrid storage scheme that combines relational and key-value storage, together with an entity clustering algorithm based on attribute frequency. The scheme uses this algorithm to partition entities into clusters; high-frequency attributes are then stored relationally, and rare attributes are stored as key-value pairs. This design mitigates the drawbacks of relational storage (such as excessive NULL values when handling sparse data) while reducing the key duplication inherent in key-value storage, and it significantly improves storage efficiency without compromising data flexibility. Experiments on synthetic and real-world datasets show that, compared with existing schemes, the proposed scheme saves over 50% of storage space on real-world datasets and increases data ingestion speed by an order of magnitude, with no significant impact on query performance. The scheme thus addresses the storage challenges of large-scale knowledge graphs and provides storage support for their application across various fields, which gives it both theoretical significance and practical value.
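The split between relational and key-value storage described in the abstract can be illustrated with a minimal sketch. It is not the paper's clustering algorithm: it assumes a single global frequency threshold (the hypothetical parameter `min_share`) in place of per-cluster decisions, and the function names and toy entities are made up for illustration.

```python
from collections import Counter


def split_by_attribute_frequency(entities, min_share=0.5):
    """Split attributes into relational columns and key-value pairs.

    entities:  dict mapping entity id -> dict of attribute -> value.
    min_share: hypothetical threshold; an attribute counts as "frequent"
               when at least this share of entities carries it.
    """
    counts = Counter(attr for props in entities.values() for attr in props)
    n = len(entities)
    columns = sorted(a for a, c in counts.items() if c / n >= min_share)

    # Relational part: one fixed-width row per entity, None stands in for NULL.
    rows = {
        eid: tuple(props.get(a) for a in columns)
        for eid, props in entities.items()
    }

    # Key-value part: only the rare attributes an entity actually has.
    column_set = set(columns)
    kv = {
        (eid, attr): value
        for eid, props in entities.items()
        for attr, value in props.items()
        if attr not in column_set
    }
    return columns, rows, kv


def null_cells(rows):
    """Count NULL (None) cells in the relational part."""
    return sum(value is None for row in rows.values() for value in row)


# Toy data (made up): "name" and "born" are common, the rest are rare.
entities = {
    "e1": {"name": "Ada", "born": 1815},
    "e2": {"name": "Alan", "born": 1912, "award": "OBE"},
    "e3": {"name": "Grace", "born": 1906},
    "e4": {"name": "Tim", "nickname": "TimBL"},
}

columns, rows, kv = split_by_attribute_frequency(entities, min_share=0.5)
print(columns)           # frequent attributes become table columns
print(kv)                # rare attributes are kept as (entity, key) -> value
print(null_cells(rows))  # NULL cells left in the relational part
```

With these four toy entities, the hybrid layout leaves one NULL cell, whereas a single wide table over all four attributes (`min_share=0.0`) would hold seven. This is the sparse-data effect the abstract refers to, shown at toy scale only.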

Key words

knowledge graph / Resource Description Framework (RDF) graph / property graph / relational database / data storage

Classification

Information Technology and Security Science

Cite this article

YOU Yiheng, WANG Xin, MA Menglu, WANG Hui. A Hybrid Storage Scheme for Large-scale Knowledge Graphs[J]. Computer Engineering, 2025, 51(12): 43-55, 13.

Funding

General Program of the National Natural Science Foundation of China (62472311).

Computer Engineering

Open access; Peking University Core Journal (北大核心)

ISSN 1000-3428
