High efficiency large-scale graph data processing mechanism in environment of Spark

Yang Tianqing, Wang Jin, Yang Xutao, Zhang Xuejie

Application Research of Computers (计算机应用研究), 2016, Vol. 33, Issue 12: 3730-3734, 3747, 6. DOI: 10.3969/j.issn.1001-3695.2016.12.047

High efficiency large-scale graph data processing mechanism in environment of Spark

Yang Tianqing¹, Wang Jin¹, Yang Xutao¹, Zhang Xuejie¹

Author information

1. School of Information, Yunnan University, Kunming 650091

Abstract

To address the inefficiency of existing frameworks for processing, storing, and managing graph data, this paper proposes a feasible mechanism for processing large-scale graph data. It first reviews the strengths and shortcomings of existing graph processing models and graph data storage frameworks. Based on an analysis of the characteristics of distributed computing, it implements a new graph data framework with three main parts: a segmentation algorithm for large-scale graphs, caching and optimization for data extraction, and a mechanism that combines the calculation layer with the persistence layer. Using the PageRank and SSSP (single-source shortest path) algorithms, experiments compare the performance of the proposed framework with MapReduce and with Spark on HDFS. The results show that the proposed framework is more than 90 times faster than MapReduce and 2 times faster than Spark with HDFS, and that it can meet the needs of high-performance graph data processing.

Key words

graph computing; in-memory computing; graph database; Hadoop; Spark; PageRank

Classification

Information technology and security science

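Cite this article

Yang Tianqing, Wang Jin, Yang Xutao, Zhang Xuejie. High efficiency large-scale graph data processing mechanism in environment of Spark[J]. Application Research of Computers, 2016, 33(12): 3730-3734, 3747, 6.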

Funding

Supported by the National Natural Science Foundation of China ()

Application Research of Computers

Indexed in: OA, PKU Core (北大核心), CSCD, CSTPCD

ISSN: 1001-3695
