Research on ETL method of transforming relational data to graph data based on sub-schema

Ding Qianglong, Wang Jin, Zhang Xuejie

Computer Engineering and Applications, 2017, Vol. 53, Issue 12: 76-84 (9 pages). DOI: 10.3778/j.issn.1002-8331.1605-0320

Research on ETL method of transforming relational data to graph data based on sub-schema

Ding Qianglong¹, Wang Jin¹, Zhang Xuejie¹

Author information

  • 1. School of Information, Yunnan University, Kunming 650091

Abstract

Graph databases outperform relational databases on problems such as multi-layer relational queries and community detection. However, most data in existing applications is stored in relational form, so extracting, transforming, and loading (ETL) relational data into graph data efficiently and completely remains an important problem when deploying graph database applications. Existing research suffers from three major limitations: (1) the quality of the converted graph data is poor; (2) the transformation efficiency is low; (3) the transformed results are not suitable for distributed storage. To overcome these limitations, this paper proposes a sub-schema-based ETL method for transforming relational data into graph data. By splitting the schema of a relational database into several sub-schemas, the method improves the algorithm and procedure of previous ETL approaches and provides an efficient route to parallel ETL. The transformed results satisfy the requirements of distributed storage and can serve as base data for the Spark GraphX computing framework. Finally, a prototype system is implemented with Java EE and Neo4j for experimental verification. The comparative results show that the improved ETL method performs better than previous methods.

Key words

graph database / distributed storage / extract-transform-load (ETL) / sub-schema

Classification

Information technology and security science

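Cite this article

Ding Qianglong, Wang Jin, Zhang Xuejie. Research on ETL method of transforming relational data to graph data based on sub-schema [J]. Computer Engineering and Applications, 2017, 53(12): 76-84.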

Funding

National Natural Science Foundation of China (No. 61170222).

Computer Engineering and Applications

OA / Peking University Core Journal / CSCD / CSTPCD

ISSN 1002-8331
