Computer Engineering (计算机工程), Issue (3): 55-58, 75, 5. DOI: 10.3969/j.issn.1000-3428.2014.03.011
基于矩阵的数据流Top-k频繁项集挖掘算法
Top-k Frequent Itemsets Mining Algorithm over Data Streams Based on Matrix
尹绍宏 (YIN Shaohong) 1, 范桂丹 (FAN Guidan) 1
Author information
- 1. School of Computer Science and Software, Tianjin Polytechnic University (天津工业大学), Tianjin 300387
Abstract
Existing algorithms produce large numbers of redundant itemsets, which reduces mining efficiency. To address this, a matrix-based Top-k frequent itemsets mining algorithm over data streams is proposed. The algorithm introduces two 0-1 matrices: a transaction matrix and a 2-itemsets matrix. The transaction matrix represents the transaction list of a sliding window, and the 2-itemsets matrix is obtained by calculating the support of each row. Candidate itemsets are then generated from the 2-itemsets matrix, and the Top-k frequent itemsets are obtained by computing the support of each candidate through a logical AND of the corresponding rows of the transaction matrix. Finally, the mining results are saved in a data dictionary. When the user issues a query, the algorithm outputs the Top-k frequent itemsets in descending order of support. Experimental results show that the algorithm avoids redundant itemsets during mining and appreciably improves mining efficiency while preserving accuracy.
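The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the details, so the following are assumptions made only for illustration. Rows of the transaction matrix stand for items and columns for the transactions in the current window; a 2-itemsets matrix entry is 1 when the pair's support reaches a hypothetical `min_support` parameter; a candidate is kept only if all of its item pairs are marked in that matrix; single items are included in the ranking. All function names are hypothetical.

```python
from itertools import combinations


def build_transaction_matrix(window, items):
    """Assumed layout: one 0-1 row per item, one column per transaction."""
    return {item: [1 if item in t else 0 for t in window] for item in items}


def support(matrix, itemset):
    """Count the columns in which the logical AND of the itemset's rows is 1."""
    rows = [matrix[item] for item in itemset]
    return sum(all(column) for column in zip(*rows))


def build_pair_matrix(matrix, items, min_support):
    """0-1 matrix over item pairs: 1 if the pair's support reaches min_support."""
    pair = {a: {b: 0 for b in items} for a in items}
    for a, b in combinations(items, 2):
        if support(matrix, (a, b)) >= min_support:
            pair[a][b] = pair[b][a] = 1
    return pair


def top_k_itemsets(window, k, min_support=1):
    """Return the k itemsets with the highest support in the window."""
    items = sorted({item for t in window for item in t})
    matrix = build_transaction_matrix(window, items)
    pair = build_pair_matrix(matrix, items, min_support)

    results = {}  # plays the role of the data dictionary: itemset -> support
    for item in items:
        s = support(matrix, (item,))
        if s >= min_support:
            results[(item,)] = s

    for size in range(2, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            # Skip candidates containing a pair that the 2-itemsets matrix rules out.
            if all(pair[a][b] for a, b in combinations(candidate, 2)):
                s = support(matrix, candidate)
                if s >= min_support:
                    results[candidate] = s
                    found = True
        if not found:
            break  # no itemset of this size survives, so no larger one can

    # Descending support; ties broken by itemset size, then lexicographically.
    ranked = sorted(results.items(), key=lambda kv: (-kv[1], len(kv[0]), kv[0]))
    return ranked[:k]


if __name__ == "__main__":
    window = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "d"}]
    for itemset, s in top_k_itemsets(window, 5):
        print(itemset, s)
```

In a streaming setting the `window` list would be refreshed as transactions arrive, for example by keeping it in a `collections.deque` with a fixed `maxlen`; the sketch recomputes both matrices from scratch for each window, which is simpler than an incremental update but slower.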
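Keywords
data mining / data stream / sliding window / matrix / Top-k frequent itemset
Classification
Information technology and security science
Cite this article
尹绍宏 (YIN Shaohong), 范桂丹 (FAN Guidan). Top-k Frequent Itemsets Mining Algorithm over Data Streams Based on Matrix [J]. Computer Engineering (计算机工程), 2014, (3): 55-58, 75, 5.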