
一种Spark集群下的shuffle优化机制

熊安萍 夏玉冲 杨方方

Computer Engineering and Applications (计算机工程与应用), 2018, Vol. 54, Issue 4: 72-76, 5. DOI: 10.3778/j.issn.1002-8331.1701-0238

Shuffle optimization for Spark cluster

熊安萍 (1), 夏玉冲 (2), 杨方方 (1)

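Author information

  • 1. School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065
  • 2. Chongqing Engineering Research Center of Mobile Internet Data Application, Chongqing 400065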

Abstract

Spark is an in-memory distributed processing framework. The large volume of data generated by the shuffle process places a heavy load on network transmission and has become one of the main bottlenecks of Spark performance. To address unbalanced data distribution, which causes I/O load imbalance across nodes, a restart policy based on task locality level is designed. Experiments verify that the optimization mechanism reduces task execution time and improves the efficiency of the shuffle process.

Key words

Spark cluster/shuffle process/data transfer/locality/scheduling strategy

Classification

Information technology and security science

Cite this article

熊安萍, 夏玉冲, 杨方方. Shuffle optimization for Spark cluster [J]. Computer Engineering and Applications, 2018, 54(4): 72-76, 5.

Funding

Doctoral Start-up Fund of Chongqing University of Posts and Telecommunications (No. A2015-17).

Computer Engineering and Applications

OA; Peking University Core (北大核心); CSCD; CSTPCD

ISSN 1002-8331
