Application Research of Computers, 2017, Vol. 34, Issue (7): 2006-2009, 2015, 5. DOI: 10.3969/j.issn.1001-3695.2017.07.019
Reduce scheduling optimization algorithm based on delay scheduling policy
Abstract
In large-scale Hadoop clusters, a good task scheduling strategy is important for improving data locality, reducing network transmission overhead, shortening job execution time, and increasing job throughput. To address the low data locality of reduce tasks in the Hadoop architecture, this paper proposes a reduce task scheduling optimization algorithm based on a delay scheduling policy, which shortens job execution time and increases job throughput by improving the data locality of reduce tasks. In the early shuffle phase, the algorithm applies a multi-stage delay scheduling policy to improve the data locality of reduce tasks. The algorithm is implemented by rewriting the native Fair Scheduler code and is compared experimentally against the native Fair Scheduler. Experimental results show that the proposed algorithm significantly reduces job execution time and improves job throughput.
Keywords
reduce task / data locality / delay scheduling / MapReduce task scheduling
Classification
Information technology and security science
Cite this article
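Shi Yilong, Lin Hong, Li Yuqiang, Wang Yan. Reduce scheduling optimization algorithm based on delay scheduling policy [J]. Application Research of Computers, 2017, 34(7): 2006-2009, 2015, 5.
Funding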
Natural Science Foundation of Hubei Province (2013CFB351)
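
Illustrative sketch
The abstract names a multi-stage delay scheduling policy for reduce tasks but does not give its stages or thresholds. The following is only a minimal sketch of the general delay scheduling idea applied to reduce tasks, not the authors' implementation. It assumes three locality levels (node, rack, any), assumes the preferred node is the one holding the most map output for the task's partition, and uses hypothetical names (`ReduceTask`, `should_launch`) and hypothetical thresholds (`d1`, `d2`).

```python
from dataclasses import dataclass

NODE_LOCAL, RACK_LOCAL, ANY = "node", "rack", "any"


@dataclass
class ReduceTask:
    # Hypothetical model: bytes of map output for this task's partition per node.
    map_output_bytes: dict
    # Number of slot offers this task has declined so far.
    skipped: int = 0


def best_node(task):
    """Node holding the most map output for this task (assumed locality target)."""
    return max(task.map_output_bytes, key=task.map_output_bytes.get)


def locality(task, node, rack_of):
    """Classify a slot offer from `node` relative to the task's preferred node."""
    target = best_node(task)
    if node == target:
        return NODE_LOCAL
    if rack_of[node] == rack_of[target]:
        return RACK_LOCAL
    return ANY


def should_launch(task, node, rack_of, d1=2, d2=4):
    """Decide whether to launch `task` on a free reduce slot offered by `node`.

    Node-local offers are always accepted. Rack-local offers are accepted only
    after d1 declined offers, and arbitrary offers only after d2 declined
    offers (assumed thresholds, with d1 <= d2).
    """
    level = locality(task, node, rack_of)
    if level == NODE_LOCAL:
        allowed = True
    elif level == RACK_LOCAL:
        allowed = task.skipped >= d1
    else:
        allowed = task.skipped >= d2
    if not allowed:
        task.skipped += 1  # wait for a better-placed slot
    return allowed
```

In this sketch a task gives up locality gradually: it waits a bounded number of offers at each level, so a busy preferred node delays the task but cannot starve it.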