Optimization Algorithm of Two Table Data Skew Join Based on MapReduce

Zhao Yulan

Journal of Jilin University (Science Edition), 2016, Vol. 54, Issue 6: 1383-1387, 5. DOI: 10.13413/j.cnki.jdxblxb.2016.06.35

Author information

  • 1. College of Information, Business College of Shanxi University, Taiyuan 030031, China

Abstract

To address the problem that the Range partition algorithm cannot optimize the efficiency of a two-table join when the tables contain heavily skewed data, we propose an improved algorithm for skewed joins. The algorithm treats skewed and non-skewed data differently, sends data to each Reduce node by replication and broadcasting, and completes the entire join in a single round of Map/Reduce tasks. It effectively balances the processing load across the Reduce nodes, which addresses the impact of heavily skewed data on two-table join performance. A comparison with the traditional partition join algorithm shows that the proposed algorithm is effective.
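
The abstract does not give the partitioning rules, so the following is only a minimal sketch of one common way to handle skew in a single Map/Reduce round, simulated in plain Python rather than on Hadoop. It is not the paper's algorithm. The names `skew_join` and `stable_hash`, the frequency threshold, and the round-robin placement of skewed tuples are all assumptions made for illustration.

```python
import zlib
from collections import Counter, defaultdict


def stable_hash(key):
    """Deterministic hash (hypothetical helper) so partitions are reproducible."""
    return zlib.crc32(repr(key).encode("utf-8"))


def skew_join(r_rows, s_rows, num_reducers, skew_threshold):
    """Simulate one Map/Reduce round joining (key, value) rows of R and S.

    Assumption: a key is 'skewed' when it occurs at least
    skew_threshold times in R. Returns (joined_rows, r_load_per_reducer).
    """
    freq = Counter(key for key, _ in r_rows)
    skewed = {key for key, n in freq.items() if n >= skew_threshold}

    # Map phase: decide which reducer(s) receive each tuple.
    r_parts = [[] for _ in range(num_reducers)]
    s_parts = [[] for _ in range(num_reducers)]
    next_slot = 0
    for key, val in r_rows:
        if key in skewed:
            # Spread skewed R tuples evenly (round-robin) over all reducers.
            target = next_slot % num_reducers
            next_slot += 1
        else:
            # Non-skewed tuples use ordinary hash partitioning.
            target = stable_hash(key) % num_reducers
        r_parts[target].append((key, val))
    for key, val in s_rows:
        if key in skewed:
            # Broadcast matching S tuples so every reducer can join locally.
            for part in s_parts:
                part.append((key, val))
        else:
            s_parts[stable_hash(key) % num_reducers].append((key, val))

    # Reduce phase: each reducer runs a local hash join on its partitions.
    output, loads = [], []
    for r_part, s_part in zip(r_parts, s_parts):
        index = defaultdict(list)
        for key, val in s_part:
            index[key].append(val)
        for key, r_val in r_part:
            for s_val in index.get(key, []):
                output.append((key, r_val, s_val))
        loads.append(len(r_part))
    return output, loads


if __name__ == "__main__":
    r = [("hot", i) for i in range(8)] + [("a", 100), ("b", 200)]
    s = [("hot", "x"), ("hot", "y"), ("a", "p"), ("c", "q")]
    rows, loads = skew_join(r, s, num_reducers=4, skew_threshold=4)
    print(len(rows), loads)
```

Because each R tuple is sent to exactly one reducer and every S tuple it can match is present there, each joined pair is produced exactly once. The cost of this design is the extra copies of S tuples for skewed keys, so it is most useful when the skewed keys are few on the broadcast side.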

Key words

MapReduce/Range partition algorithm/data skew/optimization of join algorithm

Classification

Information Technology and Security Science

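Cite this article

Zhao Yulan. Optimization Algorithm of Two Table Data Skew Join Based on MapReduce [J]. Journal of Jilin University (Science Edition), 2016, 54(6): 1383-1387, 5.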

Funding

National Natural Science Foundation of China (Grant No. 61303107)

Journal of Jilin University (Science Edition)

OA / Peking University Core / CSCD / CSTPCD

ISSN 1671-5489
