Journal of Xinyang Normal University (Natural Science Edition), 2024, Vol. 37, Issue 4: 442-448 (7 pages). DOI: 10.3969/j.issn.1003-0972.2024.04.004
Semi-Random Forest Classification Based on Closed Frequent Pattern for Data Streams
Abstract
To address the noise and concept drift present in data streams, a Semi-Random Forest Classification based on Closed Frequent Pattern (SRFCFP) algorithm for data streams is proposed. SRFCFP represents the input data stream with closed frequent patterns, which removes redundant information and noise and highlights the characteristics of the data. A semi-random forest classifier is then built on this representation, and a pattern set updating mechanism based on a time decay model is proposed to handle the continuously arriving stream. To detect and adapt to concept drift promptly, a difference measure for pattern sets is also proposed, which uses the mined patterns to measure changes in the data distribution. Experiments were performed in the MOA framework on both real-world and synthetic datasets. The results show that the proposed method can outperform the compared algorithms in average accuracy and can deal effectively with concept drift and noise.
Keywords
data stream / closed frequent pattern / semi-random forest / concept drift / noise
Classification
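Information Technology and Security Science
Cite this article
SUN Yange (孙艳歌), SHAO Han (邵罕), JIANG Mingyi (蒋明毅). Semi-Random Forest Classification Based on Closed Frequent Pattern for Data Streams [J]. Journal of Xinyang Normal University (Natural Science Edition), 2024, 37(4): 442-448.
Funding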
National Natural Science Foundation of China (61702550)
Henan Province Graduate Quality Engineering Project (YJS2023SZ23)