High Utility Itemset Mining Algorithm Based on Statistical Significance Testing

Wu Jun, Wei Dandan, Ouyang Aijia, Wang Ya

Application Research of Computers (计算机应用研究), 2024, Vol. 41, Issue (10): 2970-2977, 8. DOI: 10.19734/j.issn.1001-3695.2024.01.0027

Mining high utility itemsets based on statistical significance testing

Wu Jun¹, Wei Dandan¹, Ouyang Aijia¹, Wang Ya¹

Author information

1. School of Information Engineering, Zunyi Normal University, Zunyi, Guizhou 563000

Abstract

To address the problem that traditional high utility itemset mining algorithms report false positive high utility itemsets in transactions with class labels, this paper proposes two high utility itemset mining algorithms, FHUI and PHUI. Both algorithms first find all candidate itemsets and group them by length. FHUI then establishes null distributions from the frequency distributions, while PHUI establishes null distributions using a permutation strategy within or between transactions. Finally, both algorithms calculate p-values from the null distributions and use the false discovery rate to eliminate false positive high utility itemsets. Experiments on benchmark data sets show that FHUI and PHUI eliminate a large number of false positive itemsets, which allows them to achieve higher accuracy in classification tasks. Experiments on synthetic data sets show that the proportions of false positive itemsets reported by FHUI and PHUI are below 4.8% and that the average utility values are above 39,000. The experimental results show that the statistically significant high utility itemsets reported by FHUI and PHUI are more reliable and practical in transactions with class labels.
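
The last two steps described in the abstract (computing a p-value for each candidate itemset and filtering by the false discovery rate) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a one-sided Fisher exact test on a 2x2 table of itemset presence against a binary class label, and it assumes the Benjamini-Hochberg procedure for false discovery rate control, since the abstract does not name a specific procedure. The function names `fisher_one_sided_p` and `benjamini_hochberg` and the example counts are hypothetical.

```python
from math import comb


def fisher_one_sided_p(a, b, c, d):
    """One-sided Fisher exact test p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout (hypothetical, not taken from the paper):
      a: transactions containing the itemset, class label 1
      b: transactions containing the itemset, class label 0
      c: transactions without the itemset,   class label 1
      d: transactions without the itemset,   class label 0
    Returns the hypergeometric probability of observing a or more
    label-1 transactions among those containing the itemset.
    """
    row1 = a + b          # transactions containing the itemset
    col1 = a + c          # transactions with class label 1
    n = a + b + c + d     # all transactions
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, min(row1, col1) + 1)
    )
    return tail / comb(n, col1)


def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Standard Benjamini-Hochberg step-up procedure: find the largest
    rank k with p_(k) <= k / m * alpha and reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected


# Hypothetical candidates of one length group: (a, b, c, d) counts per itemset.
candidates = {
    ("bread", "milk"): (30, 5, 20, 45),
    ("beer", "chips"): (12, 11, 38, 39),
}
p_vals = [fisher_one_sided_p(*counts) for counts in candidates.values()]
significant = benjamini_hochberg(p_vals, alpha=0.05)
```

Candidates whose entry in `significant` is `True` would be kept as statistically significant. How FHUI and PHUI actually build their null distributions and apply the correction within each length group is specified in the paper, not here.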

Key words

data mining/high utility itemset mining/statistical significance testing/Fisher testing/permutation testing

Classification

Information Technology and Security Science

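Cite this article

Wu Jun, Wei Dandan, Ouyang Aijia, Wang Ya. Mining high utility itemsets based on statistical significance testing[J]. Application Research of Computers, 2024, 41(10): 2970-2977, 8.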

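Funding

National Natural Science Foundation of China (62066049)

Youth Project for Higher Education Institutions, Guizhou Provincial Department of Education (黔教技[2022]313, 黔教合KY[2022]015)

Science and Technology Support Program, Guizhou Provincial Department of Science and Technology (黔科合支撑[2023]257)

Zunyi Science and Technology Cooperation Project (遵市科合HZ字(2022)123)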

Application Research of Computers

Open access; indexed in the Peking University Core Journals list and CSTPCD

ISSN: 1001-3695
