Computer & Digital Engineering (计算机与数字工程), 2019, Vol. 47, Issue 10: 2405-2412, 8. DOI: 10.3969/j.issn.1672-9722.2019.10.006
基于Spark的大规模单图频繁子图算法
Single Large-Scale Graph Frequent Subgraph Algorithm Based on Spark
Abstract
With the rapid development of the Internet, campus cards have become widespread, and the data stored on servers is growing rapidly. Single-machine algorithms cannot support frequent subgraph mining and growth pattern mining at this scale, so mining frequent subgraphs from a large single graph is not feasible on one machine. The Hadoop distributed framework, in turn, is poorly suited to iterative algorithms. This paper therefore proposes FSMBUS, a distributed algorithm for mining frequent subgraphs in a single large-scale graph under the Spark framework. It generates candidate subgraphs in parallel by constructing a suboptimal CAM tree, and returns all frequent subgraphs for a given user-defined minimum support. Experiments show that FSMBUS is an order of magnitude faster than the latest single-graph mining algorithms, supports lower support thresholds and larger graph data, and runs 2 to 4 times faster than a version ported to Hadoop. Analysis of campus card data with this approach can provide a reference basis for university administrators and leadership.
Keywords
campus card / Spark / frequent subgraphs / distributed computing / large-scale single graph
Classification
Information Technology and Security Science
Cite this article
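Jiang Laihao (蒋来好), Zhu Zhixiang (朱志祥), Zhao Zichen (赵子晨). Single Large-Scale Graph Frequent Subgraph Algorithm Based on Spark [J]. Computer & Digital Engineering, 2019, 47(10): 2405-2412, 8.
Funding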
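Supported by the Shaanxi Province Key Research and Development Program (No. 2016KTTSGY01-01)
and the Xi'an University of Posts and Telecommunications Teaching Reform Research Project (No. JGZ201615).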