Microcomputer Applications (微型电脑应用), 2017, Vol. 33, Issue (6): 64-66, 80, 4.
Research on Technology of Big Data Mining Based on Spark (基于Spark平台的大数据挖掘技术研究)
Wang Xun (王殉) 1
Author information
- 1. Office of Teaching Equipment and Laboratory Management, Shaanxi Xueqian Normal University, Xi'an 710061
Abstract
Because big data is characterized by large volume and rich diversity, big data analysis places high demands on both processing speed and real-time performance. Data mining applies modeling algorithms to massive data sets to uncover the information hidden behind the data, so that big data can yield greater value. Spark is a low-latency distributed cluster computing framework for very large data sets. Building on this framework, the paper studies big data mining technology. It first designs and implements the deployment of a Spark cluster on YARN, and then proposes and implements a parallel Apriori algorithm. The algorithm supplies the distributed association analysis algorithm that the paper identifies as missing from Spark MLlib, Spark's distributed machine learning library.
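The abstract does not give the details of the parallel Apriori algorithm, so the following is only a minimal sketch of the general idea and not the paper's implementation. It assumes the transactions are already split into partitions and uses plain Python lists in place of a Spark cluster. The function names (`count_partition`, `generate_candidates`, `parallel_apriori`) and the `min_support` parameter are hypothetical and chosen for illustration.

```python
from collections import Counter
from itertools import combinations


def count_partition(transactions, candidates):
    """Map step: count, within one partition, how often each candidate occurs."""
    counts = Counter()
    for transaction in transactions:
        for candidate in candidates:
            if candidate <= transaction:  # candidate is a subset of the transaction
                counts[candidate] += 1
    return counts


def generate_candidates(frequent, k):
    """Join frequent (k-1)-itemsets into k-itemsets, pruning by the Apriori property."""
    frequent = set(frequent)
    candidates = set()
    for a in frequent:
        for b in frequent:
            union = a | b
            # Keep the union only if every (k-1)-subset is itself frequent.
            if len(union) == k and all(
                frozenset(subset) in frequent
                for subset in combinations(union, k - 1)
            ):
                candidates.add(union)
    return candidates


def parallel_apriori(partitions, min_support):
    """Return {itemset: count} for every itemset with count >= min_support.

    `partitions` is a list of lists of transactions (assumed layout).
    """
    partitions = [[frozenset(t) for t in part] for part in partitions]
    items = {item for part in partitions for t in part for item in t}
    candidates = {frozenset([item]) for item in items}
    result = {}
    k = 1
    while candidates:
        # Map: each partition is counted independently of the others.
        partial = [count_partition(part, candidates) for part in partitions]
        # Reduce: merge the per-partition counts into global counts.
        total = Counter()
        for counts in partial:
            total.update(counts)
        frequent = {c: n for c, n in total.items() if n >= min_support}
        result.update(frequent)
        k += 1
        candidates = generate_candidates(frequent.keys(), k)
    return result


if __name__ == "__main__":
    demo_partitions = [
        [{"a", "b", "c"}, {"a", "b"}],
        [{"a", "c"}, {"b", "c"}, {"a", "b", "c"}],
    ]
    for itemset, count in parallel_apriori(demo_partitions, min_support=3).items():
        print(sorted(itemset), count)
```

On an actual Spark cluster, the per-partition counting would plausibly map onto an RDD operation such as `mapPartitions`, and the merge onto `reduceByKey`, with the candidate set broadcast to the workers at each pass. That mapping is an assumption here, not something the abstract states.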
Keywords
Big data / Data mining / Spark
Cite this article
Wang Xun. Research on Technology of Big Data Mining Based on Spark [J]. Microcomputer Applications, 2017, 33(6): 64-66, 80, 4.