Journal of Integration Technology, 2016, Vol. 5, Issue (3): 60-71, 12.
Hadoop+: A Heterogeneous Big-Data Programming Environment
Hadoop+: A Big-data Programming Framework for Heterogeneous Computing Environments
Abstract
The rapid development of the Internet and the Internet of Things has opened the era of big data. Heterogeneous architectures are now widely adopted in large-scale datacenters to improve performance and reduce energy consumption. This paper presents the design and implementation of Hadoop+, a programming framework that implements MapReduce, enables the invocation of parallelized CUDA/OpenCL code within a map/reduce task, and assists users through a heterogeneous task model. Experimental results show that Hadoop+ attains 1.4× to 16.1× speedups over Hadoop for five commonly used machine learning algorithms. Coupled with a heterogeneous task model that helps allocate computing resources, Hadoop+ brings a 36.0% improvement in data processing speed for single-application workloads; for mixed workloads of multiple applications, the execution time is reduced by up to 36.9%, with an average of 17.6%.
Keywords
heterogeneous / datacenter / Hadoop+ / MapReduce
Classification
Information Technology and Security Science
Cite this article
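He Wenting, Cui Huimin, Feng Xiaobing. Hadoop+: A Big-data Programming Framework for Heterogeneous Computing Environments [J]. Journal of Integration Technology, 2016, 5(3): 60-71, 12.
Funding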
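National Basic Research Program of China (973 Program) (2011CB302504); National High Technology Research and Development Program of China (863 Program) (2012AA010902, 2015AA011505); National Natural Science Foundation of China (61202055, 61221062, 61303053, 61432016, 61402445)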