Cleaning inconsistent data based on statistical inference

Zhang Anzhen (张安珍), Hu Shengji (胡生吉), Xia Xiufeng (夏秀峰)

Application Research of Computers (计算机应用研究), 2024, Vol. 41, Issue 10: 2987-2992, 6. DOI: 10.19734/j.issn.1001-3695.2024.02.0055

Zhang Anzhen¹, Hu Shengji², Xia Xiufeng²

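Author information

  • 1. Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang 110168; School of Computer Science, Shenyang Aerospace University, Shenyang 110136
  • 2. School of Computer Science, Shenyang Aerospace University, Shenyang 110136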

Abstract

Inconsistent data repair is an important research direction in the field of data repair. Most existing methods rely on integrity constraint rules and repair data according to the minimum-cost principle. However, the minimum-cost repair scheme is usually incorrect, which leads to the low accuracy of existing repair methods. To address this problem, this paper proposes BayesOUR, an inconsistent data repair method based on statistical inference that balances repair cost against repair quality to improve repair accuracy. BayesOUR consists of three phases. First, it detects errors using the integrity constraint rules. Second, it uses a Bayesian network to infer the probability of every possible consistent repair scheme. Finally, it selects the repair scheme with the largest probability and applies it to the data. Experimental results on real data show that the proposed method significantly improves the accuracy of inconsistent data repair compared with current leading methods.
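
The three phases named in the abstract (detect, infer, select) can be illustrated with a toy sketch. This is not the authors' BayesOUR implementation: the table, the functional dependency `zip -> city`, and the function names `detect_violations`, `log_posterior`, and `repair` are all hypothetical, and a naive Bayes model with Laplace smoothing stands in for the Bayesian network used in the paper.

```python
import math
from collections import Counter, defaultdict

# Invented toy table; row 2 violates the assumed dependency zip -> city.
ROWS = [
    {"zip": "110136", "area_code": "024", "city": "Shenyang"},
    {"zip": "110136", "area_code": "024", "city": "Shenyang"},
    {"zip": "110136", "area_code": "024", "city": "Dalian"},
    {"zip": "116000", "area_code": "0411", "city": "Dalian"},
    {"zip": "116000", "area_code": "0411", "city": "Dalian"},
]


def detect_violations(rows, lhs, rhs):
    """Phase 1: return {lhs value: row indices} for groups with conflicting rhs values."""
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[row[lhs]].append(i)
    return {key: idx for key, idx in groups.items()
            if len({rows[i][rhs] for i in idx}) > 1}


def log_posterior(rows, target, evidence, value, row, alpha=1.0):
    """Phase 2: unnormalized log P(target=value | evidence in row), naive Bayes.

    Assumption: naive Bayes replaces the paper's Bayesian network here.
    """
    target_counts = Counter(r[target] for r in rows)
    score = math.log((target_counts[value] + alpha)
                     / (len(rows) + alpha * len(target_counts)))
    for attr in evidence:
        domain = {r[attr] for r in rows}
        joint = sum(1 for r in rows
                    if r[target] == value and r[attr] == row[attr])
        score += math.log((joint + alpha)
                          / (target_counts[value] + alpha * len(domain)))
    return score


def repair(rows, lhs, rhs, evidence):
    """Phase 3: per violating group, apply the consistent repair with the highest score."""
    repaired = [dict(r) for r in rows]
    for idx in detect_violations(rows, lhs, rhs).values():
        # A consistent repair assigns one rhs value to every row in the group.
        candidates = sorted({rows[i][rhs] for i in idx})
        best = max(candidates, key=lambda v: sum(
            log_posterior(rows, rhs, evidence, v, rows[i]) for i in idx))
        for i in idx:
            repaired[i][rhs] = best
    return repaired


if __name__ == "__main__":
    for fixed_row in repair(ROWS, "zip", "city", ["zip", "area_code"]):
        print(fixed_row)
```

In this sketch the candidate repairs for a violating group are limited to the values already observed in that group, which keeps the search space small; the paper's own candidate generation and cost model may differ.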

Keywords

inconsistent data/Bayesian network/probabilistic inference

Classification

Information technology and security science

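Cite this article

Zhang Anzhen, Hu Shengji, Xia Xiufeng. Cleaning inconsistent data based on statistical inference[J]. Application Research of Computers, 2024, 41(10): 2987-2992, 6.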

Funding

Young Scientists Fund of the National Natural Science Foundation of China (6210071734)

Application Research of Computers

OA / Peking University Core Journal / CSTPCD

ISSN 1001-3695
