Journal of Central China Normal University (Natural Sciences), 2017, Vol. 51, Issue 4: 429-434 (6 pages). DOI: 10.19603/j.cnki.1000-1190.2017.04.003
An algorithm for mining frequent item sets from data streams based on MapReduce (original title: 基于MapReduce的数据流频繁项集挖掘算法)
Abstract
To address the low efficiency and high memory consumption of traditional frequent item set computation, this paper presents a new frequent item set mining algorithm based on the MapReduce parallel computing model. First, the data are divided into small pieces so that they can be compressed and transmitted. Second, the frequent item sets are computed in a distributed fashion across load-balanced data nodes, which greatly improves efficiency. Finally, the data sets generated by the individual nodes are merged. Theoretical analysis and experimental results show that the algorithm is effective and feasible for mining frequent item sets from data streams in parallel.
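The three steps named in the abstract (partitioning, per-node counting, merging) can be illustrated with a minimal single-process sketch. This is not the authors' algorithm, whose details the abstract does not give: the function names, the brute-force enumeration of item sets up to `max_size`, and the absolute `min_support` threshold are all assumptions made for illustration, and stream-specific concerns such as windowing, compression, and load balancing are left out.

```python
from collections import Counter
from itertools import combinations


def split_into_chunks(transactions, n_chunks):
    """Divide the transaction list into roughly equal pieces (one per node)."""
    size = max(1, -(-len(transactions) // n_chunks))  # ceiling division
    return [transactions[i:i + size] for i in range(0, len(transactions), size)]


def map_count(chunk, max_size):
    """Map step: count every item set of up to max_size items within one chunk."""
    counts = Counter()
    for transaction in chunk:
        items = sorted(set(transaction))  # canonical order, duplicates removed
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    return counts


def reduce_merge(partial_counts, min_support):
    """Reduce step: sum the per-chunk counts and keep item sets meeting min_support."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return {itemset: n for itemset, n in total.items() if n >= min_support}


def frequent_itemsets(transactions, n_chunks=2, max_size=2, min_support=2):
    """Hypothetical driver: split, count per chunk, then merge."""
    chunks = split_into_chunks(transactions, n_chunks)
    # On a real cluster each chunk would be handled by a separate node;
    # here the map calls simply run one after another.
    partials = [map_count(chunk, max_size) for chunk in chunks]
    return reduce_merge(partials, min_support)
```

Because the reduce step only sums counts, the result does not depend on how the transactions are split, which is what allows the counting to be spread over several nodes.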
Key words
MapReduce / frequent item sets / data streams / parallel computing / data mining
Classification
Information Technology and Security Science
Cite this article
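Zhu Fubao (朱付保), Bai Qingchun (白庆春), Tang Mengmeng (汤萌萌), Zhu Haodong (朱颢东). An algorithm for mining frequent item sets from data streams based on MapReduce [J]. Journal of Central China Normal University (Natural Sciences), 2017, 51(4): 429-434.
Funding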
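National Natural Science Foundation of China (61201447)
Henan Province Key Science and Technology Program (122102210492)
Key Science and Technology Research Project of the Education Department of Henan Province (13A520368, 13A520367)
Funding Program for Young Backbone Teachers in Higher Education Institutions of Henan Province (2014GGJS-084)
Zhengzhou Institute of Light Industry Training Program for Young Backbone Teachers (XGGJS02)
Doctoral Research Fund of Zhengzhou Institute of Light Industry (2010BSJJ038)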