Optimizing the MapReduce Collaborative Filtering Algorithm with Spark DAG

Liao Bin, Zhang Tao, Yu Jiong, Guo Binglei, Zhang Xuguang, Liu Yan

Journal of Sun Yat-sen University (Natural Science Edition), 2017, Vol. 56, Issue 3: 46-56, 11. DOI: 10.13471/j.cnki.acta.snus.2017.03.008

Optimization of collaborative filtering algorithm based on DAG Spark scheduling

Liao Bin ¹, Zhang Tao ², Yu Jiong ³, Guo Binglei ², Zhang Xuguang ², Liu Yan ¹

Author information

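1. School of Statistics and Information, Xinjiang University of Finance and Economics, Urumqi, Xinjiang 830012
2. School of Information Science and Engineering, Xinjiang University, Urumqi, Xinjiang 830008
3. College of Medical Engineering and Technology, Xinjiang Medical University, Urumqi, Xinjiang 830011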

Abstract

The scale of big data poses great challenges for data storage, management, and analysis, and efficient, low-cost big data processing technology has become a research hotspot in academia and industry. To improve the efficiency of collaborative filtering algorithms, the implementation of the algorithm under the MapReduce architecture is decomposed in order to analyze its defects. Because Spark is well suited to iterative and interactive tasks, this paper presents methods for improving execution efficiency by moving the algorithm from the MapReduce platform to the Spark platform. The implementation flow of the algorithm in Spark is designed, and efficiency is further improved through parameter tuning and memory optimization. Experimental results show that, based on Spark DAG scheduling, the algorithm reduces HDFS I/O operations by more than 65%, and that execution efficiency and energy efficiency increase by nearly 200% and 50%, respectively.

Key words

collaborative filtering/MapReduce/Spark/algorithm optimization/energy consumption optimization

Classification

Computer science and automation

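Cite this article

Liao Bin, Zhang Tao, Yu Jiong, Guo Binglei, Zhang Xuguang, Liu Yan. Optimization of collaborative filtering algorithm based on DAG Spark scheduling [J]. Journal of Sun Yat-sen University (Natural Science Edition), 2017, 56(3): 46-56, 11.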

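Funding

National Natural Science Foundation of China (61562078, 61262088)

Natural Science Foundation of Xinjiang Uygur Autonomous Region (2016D01B014)

Doctoral Start-up Fund of Xinjiang University of Finance and Economics (2015BS007)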

Journal of Sun Yat-sen University (Natural Science Edition)

Indexing: OA, Peking University Core Journals, CSCD, CSTPCD

ISSN: 0529-6579
