Application Research of Computers (计算机应用研究), 2016, Vol. 33, Issue 12: 3730-3734, 3747, 6. DOI: 10.3969/j.issn.1001-3695.2016.12.047
A high-efficiency mechanism for large-scale graph data processing in the Spark environment
Abstract
To address inefficiencies in existing frameworks for processing, storing, and managing graph data, this paper proposes a feasible processing mechanism for large-scale graph data. It first reviews the advantages and shortcomings of existing graph processing models and graph data storage frameworks. Based on an analysis of the characteristics of distributed computing, it then implements a new graph data framework with three main parts: a segmentation algorithm for large-scale graphs, caching and optimization for data extraction, and a mechanism that combines the computation and persistence layers. Experiments with the PageRank and SSSP algorithms compare the performance of the proposed framework with that of MapReduce and of Spark with HDFS. The results show that the proposed framework is more than 90 times faster than MapReduce and 2 times faster than Spark with HDFS, and that it can satisfy the needs of high-performance graph data processing.
Keywords
graph computing / memory computing / graph database / Hadoop / Spark / PageRank
Classification
Information technology and security science
Cite this article
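YANG Tianqing, WANG Jin, YANG Xutao, ZHANG Xuejie. A high-efficiency mechanism for large-scale graph data processing in the Spark environment [J]. Application Research of Computers, 2016, 33(12): 3730-3734, 3747, 6.
Funding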
Project supported by the National Natural Science Foundation of China ()