Application Research of Computers (计算机应用研究), 2024, Vol. 41, Issue 10: 2987-2992 (6 pages). DOI: 10.19734/j.issn.1001-3695.2024.02.0055
Cleaning inconsistent data based on statistical inference
Abstract
Inconsistent data repair is an important research direction in the field of data repair. Most existing methods are based on integrity constraint rules and repair data according to the minimum-cost principle. However, the minimum-cost repair scheme is usually incorrect, which leads to the low accuracy of existing repair methods. To address this problem, this paper proposes BayesOUR, an inconsistent data repair method based on statistical inference that balances repair cost against repair quality to improve repair accuracy. BayesOUR consists of three phases. First, it detects errors based on integrity constraint rules. It then uses a Bayesian network to infer the probability of every possible consistent repair scheme. Finally, it selects the repair scheme with the highest probability and applies it to the data. Experimental results on real data show that the method significantly improves the accuracy of inconsistent data repair compared with current leading methods.
Key words
inconsistent data/Bayesian network/probabilistic inference
Classification
Information technology and security science
Cite this article
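Zhang Anzhen (张安珍), Hu Shengji (胡生吉), Xia Xiufeng (夏秀峰). Cleaning inconsistent data based on statistical inference [J]. Application Research of Computers, 2024, 41(10): 2987-2992, 6.
Funding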
National Natural Science Foundation of China, Young Scientists Fund (6210071734)
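
The three phases named in the abstract (constraint-based error detection, probabilistic scoring of candidate repairs, and selection of the most probable repair) can be illustrated with a minimal sketch. This is not the BayesOUR implementation. It assumes a single functional dependency `zip -> city` as the integrity constraint and uses the smoothed conditional table P(city | zip) of a two-node network as a stand-in for a learned Bayesian network. The table contents, attribute names, and function names are all hypothetical.

```python
from collections import Counter, defaultdict

# Toy relation; attribute names and values are hypothetical.
ROWS = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "Newark"},   # violates zip -> city
    {"zip": "60601", "city": "Chicago"},
    {"zip": "60601", "city": "Chicago"},
]


def detect_violations(rows, lhs, rhs):
    """Phase 1: return the lhs values whose rows disagree on rhs (FD lhs -> rhs)."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[lhs]].add(row[rhs])
    return {key for key, values in seen.items() if len(values) > 1}


def conditional_probs(rows, lhs, rhs, alpha=1.0):
    """Phase 2: estimate P(rhs | lhs) with Laplace smoothing.

    This is the conditional probability table of a two-node network
    lhs -> rhs, a deliberately tiny stand-in for a learned Bayesian network.
    """
    domain = sorted({row[rhs] for row in rows})
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[lhs]][row[rhs]] += 1
    probs = {}
    for key, counter in counts.items():
        total = sum(counter.values()) + alpha * len(domain)
        probs[key] = {v: (counter[v] + alpha) / total for v in domain}
    return probs


def repair(rows, lhs, rhs):
    """Phase 3: in each violating group, assign the most probable rhs value."""
    dirty = detect_violations(rows, lhs, rhs)
    probs = conditional_probs(rows, lhs, rhs)
    repaired = []
    for row in rows:
        fixed = dict(row)  # copy so the input rows are left untouched
        if row[lhs] in dirty:
            table = probs[row[lhs]]
            fixed[rhs] = max(table, key=table.get)
        repaired.append(fixed)
    return repaired


REPAIRED = repair(ROWS, "zip", "city")
```

In this toy case the most probable value coincides with the majority value, so it matches what a minimum-cost repair would pick. The abstract's argument concerns the cases where a network over more attributes assigns the highest probability to a repair that is not the cheapest one.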