Journal of Xidian University (Natural Science Edition), 2014, Issue (2): 191-196 (6 pages). DOI: 10.3969/j.issn.1001-2400.2014.02.031
Prefetching method for Hadoop MapReduce environments (original Chinese title: 一种适用于Hadoop MapReduce环境的数据预取方法)
Abstract
Because of data dependencies and the particular task execution mode in MapReduce environments, reduce tasks incur substantial remote data access delay and unnecessary resource competition, which degrades system performance. To address this problem, we propose a pre-fetching method based on pre-scheduling. The method hides remote data access delay through pre-fetching and controls resource competition by adjusting the resource allocation of reduce tasks. The method is implemented in Hadoop-0.20.2. Experimental results show that it improves system performance by more than 10% compared with both default Hadoop MapReduce and the Hadoop Online Prototype.
Keywords
MapReduce / distributed computing / pre-fetching / scheduling
Classification
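Information technology and security science
Cite this article
Zhang Xiaohong, Luo Fen, Jia Zongpu, Shen Jiquan. Prefetching method for Hadoop MapReduce environments [J]. Journal of Xidian University (Natural Science Edition), 2014, (2): 191-196.
Funding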
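National Natural Science Foundation of China (51274088)
Education Department of Henan Province (ITE12103)
Doctoral Fund of Henan Polytechnic University (B2012-099)
Provincial Key Laboratory of Mine Informatization, Henan Polytechnic University (KY2012-05)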