基于磁盘I/O性能的Hadoop任务选择策略

李强 孙震宇 雷晓凤 孙功星

Computer Engineering (计算机工程), 2016, Vol. 42, Issue 11: 76-82 (7 pages). DOI: 10.3969/j.issn.1000-3428.2016.11.013

Hadoop Task Selection Strategy Based on Disk I/O Performance

李强 (1), 孙震宇 (2), 雷晓凤 (1), 孙功星 (2)

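Author information

  • 1. Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049
  • 2. University of Chinese Academy of Sciences, Beijing 100049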

Abstract

Making maximum use of local disk I/O resources is key to improving the performance of a computing cluster, but most scheduling algorithms in Hadoop do not take this factor into account. To address this problem, a new task selection strategy is proposed. It takes disk workload as a parameter in MAP task selection and consults the workload of each disk when choosing a task during scheduling, so as to balance the disk workload on data nodes. A new task selection module is designed and integrated into the Hadoop task scheduler. To further improve Hadoop's performance, a suitable fully localized job execution mechanism is also implemented. Experimental results show that the proposed strategy makes full use of disk I/O resources, reduces I/O wait by 5% on average, increases CPU utilization by 15% on average, and reduces job execution time by 20%.
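
The abstract does not give the selection algorithm itself, so the following is only a minimal sketch of the general idea of disk-aware MAP task selection, not the authors' implementation. It assumes that each pending task reads its input from one known local disk and that disk workload can be approximated by the number of tasks currently reading from that disk. The names `MapTask`, `select_map_task`, and `schedule` are hypothetical and do not correspond to any Hadoop API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class MapTask:
    task_id: str
    disk: str  # hypothetical: local disk that stores this task's input block


def select_map_task(pending: List[MapTask],
                    disk_load: Dict[str, float]) -> Optional[MapTask]:
    """Return the pending task whose input disk currently has the lowest load."""
    if not pending:
        return None
    # Assumption: a disk with no recorded load is treated as idle (0.0).
    # Ties are broken by queue order, because min() keeps the first minimum.
    return min(pending, key=lambda t: disk_load.get(t.disk, 0.0))


def schedule(pending: List[MapTask], disks: List[str],
             slots: int) -> List[MapTask]:
    """Fill up to `slots` task slots, using running tasks per disk as the load."""
    load = {d: 0.0 for d in disks}
    remaining = list(pending)
    chosen: List[MapTask] = []
    for _ in range(slots):
        task = select_map_task(remaining, load)
        if task is None:
            break
        remaining.remove(task)
        # Launching the task adds one reader to its input disk.
        load[task.disk] = load.get(task.disk, 0.0) + 1.0
        chosen.append(task)
    return chosen


if __name__ == "__main__":
    queue = [MapTask("t1", "sda"), MapTask("t2", "sda"),
             MapTask("t3", "sdb"), MapTask("t4", "sdb")]
    for task in schedule(queue, ["sda", "sdb"], slots=2):
        print(task.task_id, task.disk)
```

With two free slots and two queued tasks on each of two disks, this sketch picks one task per disk, whereas plain queue order would start both tasks on the same disk. A real scheduler would presumably use a measured I/O metric rather than a task count as the load value.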

Key words

Hadoop system; scheduling algorithm; data locality; task selection strategy; disk workload; I/O performance

Classification

Information Technology and Security Science

Cite this article

李强, 孙震宇, 雷晓凤, 孙功星. 基于磁盘I/O性能的Hadoop任务选择策略 [J]. 计算机工程 (Computer Engineering), 2016, 42(11): 76-82.

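Funding

National Natural Science Foundation of China (11375223, 11375221); Joint Fund of the National Natural Science Foundation of China and the Chinese Academy of Sciences for Large-Scale Scientific Facilities (11179020).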

Computer Engineering (计算机工程)

Open access; indexed in: Peking University Core Journals, CSCD, CSTPCD

ISSN 1000-3428
