Journal of Southeast University (Natural Science Edition), 2017, Vol. 47, No. 2: 231-235. DOI: 10.3969/j.issn.1001-0505.2017.02.006
并行计算框架Spark的自动检查点策略
Automatic checkpoint strategy for the Spark parallel computing framework
Abstract
The existing Spark checkpoint mechanism requires the programmer to select checkpoints based on experience, which carries a degree of risk and randomness and can result in a large recovery overhead. To address this problem, the characteristics of resilient distributed datasets (RDDs) were analyzed, and the weight generation (WG) algorithm and the checkpoint automatic selection (CAS) algorithm were proposed. First, the WG algorithm analyzes the directed acyclic graph (DAG) of a job and obtains the lineage length and the operation complexity of each RDD to compute its weight. Second, the CAS algorithm selects the RDD with the maximum weight and sets a checkpoint on it asynchronously to enable fast recovery. The experimental results show that, compared with the original Spark, the CAS algorithm increases the execution time and the checkpoint size for the different datasets, with the increase on Wiki-Talk being the most pronounced. For single-node failure recovery, the datasets have a smaller recovery overhead after checkpoints are set by the CAS algorithm. Therefore, the strategy can efficiently decrease the recovery overhead of jobs at the cost of a slight extra overhead.
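The WG/CAS idea described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the DAG model, the operation-complexity scores, and the weight formula (lineage length multiplied by operation complexity) are assumptions made for demonstration.

```python
# Sketch of WG/CAS: weight each RDD in a job's DAG by its lineage length
# and operation complexity, then pick the RDD with the maximum weight as
# the checkpoint candidate. Complexity scores below are illustrative.
OP_COMPLEXITY = {"map": 1, "filter": 1, "flatMap": 2,
                 "reduceByKey": 3, "groupByKey": 4, "join": 4}

class RDDNode:
    def __init__(self, name, op, parents=()):
        self.name = name
        self.op = op                  # transformation that produced this RDD
        self.parents = list(parents)  # parent RDDs in the lineage DAG

    def lineage_length(self):
        # Longest chain of transformations from any input RDD to this one.
        if not self.parents:
            return 0
        return 1 + max(p.lineage_length() for p in self.parents)

def weight(rdd):
    # WG sketch: weight grows with both recomputation depth and cost.
    return rdd.lineage_length() * OP_COMPLEXITY.get(rdd.op, 1)

def select_checkpoint(rdds):
    # CAS sketch: choose the RDD whose loss would be costliest to recompute.
    return max(rdds, key=weight)

# Example DAG for a word-count-style job:
# textFile -> flatMap -> map -> reduceByKey
src = RDDNode("lines", "textFile")
words = RDDNode("words", "flatMap", [src])
pairs = RDDNode("pairs", "map", [words])
counts = RDDNode("counts", "reduceByKey", [pairs])

best = select_checkpoint([src, words, pairs, counts])
print(best.name)  # → counts (deepest lineage, costly transformation)
```

In an actual Spark job, the selected RDD would then be checkpointed via the standard `rdd.checkpoint()` API; the paper's contribution is performing this selection automatically and writing the checkpoint asynchronously rather than blocking the job.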
Keywords
automatic checkpoint / resilient distributed dataset (RDD) weight / Spark / recovery time
Classification
Information technology and security science
Citation
英昌甜, 于炯, 卞琛, 鲁亮, 钱育蓉. 并行计算框架Spark的自动检查点策略[J]. 东南大学学报(自然科学版), 2017, 47(2): 231-235.
Funding
Supported by the National Natural Science Foundation of China (61462079, 61262088, 61562086, 61363083, 61562078) and the Scientific Research Program of the Higher Education Institutions of Xinjiang Uygur Autonomous Region (XJEDU2016S106).