Journal of Harbin Engineering University, 2018, Vol. 39, Issue (4): 721-729, 9. DOI: 10.11990/jheu.201611075
An efficient algorithm for mining high utility itemsets from data streams based on sliding window techniques
Abstract
Existing algorithms for high utility itemset mining (HUIM) over a sliding window have two problems: the number of candidates is usually very large, which requires extensive memory, and candidate verification is time-consuming. This paper therefore proposes HUISW (high utility itemset mining over a sliding window), an efficient algorithm that mines high utility itemsets from a data stream without generating candidates. HUISW adopts a novel tree structure, HUIL-Tree (a high utility itemset tree that arranges items in lexicographic order), to store information on the itemsets in a sliding window, and a utility database to store the utility information on the itemsets in the transactions of a window. During mining, the pattern-growth method is used to generate itemsets from the HUIL-Tree. For each generated itemset, its utility in the window is calculated directly from the correspondence between the itemset and the utility database, so the whole process generates no candidates. Extensive experiments on both sparse and dense stream datasets compare HUISW with the state-of-the-art algorithm SHU-Growth (sliding window based high utility growth). The experimental results show that HUISW significantly outperforms SHU-Growth, with a runtime two orders of magnitude shorter.
Keywords
high utility itemsets / pattern growth / data streams / utility mining / sliding window / data mining
Classification
Information Technology and Security Science
Cite this article
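GUO Shiming, GAO Hong. An efficient algorithm for mining high utility itemsets from data streams based on sliding window techniques [J]. Journal of Harbin Engineering University, 2018, 39(4): 721-729, 9.
Funding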
National Natural Science Foundation of China (61190115).
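
The problem the abstract addresses can be illustrated with a minimal sketch. This is not the HUISW algorithm and does not build a HUIL-Tree; it is a naive enumeration, written only to show what "utility of an itemset in a window" means and why an item's supersets cannot simply be pruned by its own utility. The transaction layout (a dict mapping each item to its utility in that transaction), the sample stream, the window size of 3 and the threshold `min_util=12` are all assumptions made for illustration.

```python
from collections import deque
from itertools import combinations


def itemset_utility(itemset, window):
    """Sum the utility of `itemset` over every window transaction that contains it."""
    total = 0
    for transaction in window:
        if all(item in transaction for item in itemset):
            total += sum(transaction[item] for item in itemset)
    return total


def high_utility_itemsets(window, min_util):
    """Naively enumerate itemsets (items in lexicographic order) and keep
    those whose utility in the window reaches min_util."""
    items = sorted({item for transaction in window for item in transaction})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, window)
            if utility >= min_util:
                result[itemset] = utility
    return result


# Hypothetical stream: each transaction maps item -> utility in that transaction.
stream = [
    {"a": 5, "b": 2},
    {"a": 5, "c": 1},
    {"b": 4, "c": 3},
    {"a": 10, "b": 2, "c": 1},
]

# The deque drops the oldest transaction automatically once it holds 3.
window = deque(maxlen=3)
results = []
for transaction in stream:
    window.append(transaction)
    if len(window) == window.maxlen:
        results.append(high_utility_itemsets(window, min_util=12))

for r in results:
    print(r)
```

In the second window, item b alone has utility 6 while {a, b} has utility 12, so utility is not anti-monotone. Exhaustive enumeration like this grows exponentially with the number of items, which is the cost that tree-based, pattern-growth approaches such as the one described in the abstract aim to avoid.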