Computer Engineering, 2016, Vol. 42, Issue (11): 76-82, 7. DOI: 10.3969/j.issn.1000-3428.2016.11.013
Hadoop Task Selection Strategy Based on Disk I/O Performance
Abstract
Making maximum use of local disk I/O resources is key to improving the performance of a computing cluster, but most scheduling algorithms in the Hadoop system do not take this factor into account. To address this problem, a new task selection strategy is proposed. It takes disk workload as a parameter in Map task selection and consults the workload of each disk to choose an appropriate task during scheduling, so that the disk workload on data nodes stays balanced. In addition, a new task selection module is designed and integrated into the Hadoop task scheduler. To further improve the performance of the Hadoop system, an appropriate fully localized job execution mechanism is implemented. Experimental results show that the proposed strategy makes full use of disk I/O resources, reduces I/O wait by 5% on average, increases CPU utilization by 15% on average, and reduces job execution time by 20%.
Keywords
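Hadoop system/scheduling algorithm/data locality/task selection strategy/disk workload/I/O performance
Classification
Information Technology and Security Science
Cite this article
Li Qiang, Sun Zhenyu, Lei Xiaofeng, Sun Gongxing. Hadoop Task Selection Strategy Based on Disk I/O Performance[J]. Computer Engineering, 2016, 42(11): 76-82, 7.
Funding
National Natural Science Foundation of China (11375223, 11375221); Joint Fund for Large-Scale Scientific Facilities of the National Natural Science Foundation of China and the Chinese Academy of Sciences (11179020).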
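
Illustrative sketch
The abstract describes choosing Map tasks with reference to the workload of each disk, but gives no algorithmic detail. The following Python sketch is only one plausible reading of that idea, not the authors' implementation: among pending node-local Map tasks, it picks the task whose input block sits on the currently least-loaded disk. The names `MapTask`, `select_map_task`, and `schedule`, and the use of a simple count of running tasks per disk as the "workload", are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class MapTask:
    task_id: str
    disk: str  # hypothetical: local disk that holds this task's input block


def select_map_task(pending: List[MapTask],
                    disk_load: Dict[str, int]) -> Optional[MapTask]:
    """Return the pending task whose input disk is currently least loaded."""
    if not pending:
        return None
    # min() returns the first minimal element, so ties keep queue order.
    return min(pending, key=lambda t: disk_load.get(t.disk, 0))


def schedule(pending: List[MapTask], disks: List[str],
             slots: int) -> List[MapTask]:
    """Fill up to `slots` free Map slots, updating disk load after each pick."""
    # Assumption: load is the number of tasks already assigned to a disk.
    disk_load = {d: 0 for d in disks}
    queue = list(pending)
    chosen: List[MapTask] = []
    for _ in range(slots):
        task = select_map_task(queue, disk_load)
        if task is None:
            break
        queue.remove(task)
        disk_load[task.disk] = disk_load.get(task.disk, 0) + 1
        chosen.append(task)
    return chosen


tasks = [MapTask("t1", "sda"), MapTask("t2", "sda"),
         MapTask("t3", "sdb"), MapTask("t4", "sdb")]
picked = schedule(tasks, ["sda", "sdb"], slots=2)
print([t.task_id for t in picked])
```

With two free slots, a plain first-in-first-out choice would start `t1` and `t2`, both reading from `sda`; the sketch instead spreads the two tasks across `sda` and `sdb`. A real scheduler would presumably use a measured I/O metric rather than a task count.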