Journal of East China Normal University (Natural Science), Issue (1): 76-90, 15. DOI: 10.3969/j.issn.1000-5641.2018.01.008
面向概率RDF数据库查询的数据清洗
Data cleaning on probabilistic RDF database
Abstract
Errors and noise introduced while data is obtained and analyzed give rise to uncertain data in many domains, and this uncertainty has emerged as an important issue affecting the usability of data. Uncertain data can be stored in probabilistic databases, whose query facilities return answers annotated with confidence values. However, the accumulation and propagation of uncertainty may reduce the usability of query results, so it is desirable to reduce the uncertainty of the underlying data. This paper addresses the problem of increasing the certainty of answers to RDF (Resource Description Framework) graph queries via crowdsourcing. The basic idea is to ask the crowd to decide whether the relationships represented by certain edges are correct. We introduce three algorithms for selecting the edge that maximizes the uncertainty reduction. Finally, we evaluate these algorithms experimentally and show that the unstable pruning algorithm and the stable pruning algorithm perform better in terms of efficiency.
Keywords
probabilistic RDF graph / crowdsourcing / data cleaning
Classification
Information technology and security science
Cite this article
WANG Zhen, LIN Xin. Data cleaning on probabilistic RDF database [J]. Journal of East China Normal University (Natural Science), 2018, (1): 76-90, 15.
Funding
National Natural Science Foundation of China (61572193)
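The edge-selection idea summarized in the abstract can be illustrated with a small sketch. It is not one of the paper's three algorithms (the abstract does not specify them); it is a brute-force baseline under assumptions made here for illustration: edges exist independently with known probabilities, the crowd always answers correctly, and uncertainty is measured as the binary entropy of a single yes/no query answer. All function names, the example triples, and their probabilities are hypothetical.

```python
from itertools import product
from math import log2


def entropy(p):
    """Binary entropy (in bits) of an answer that holds with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def answer_probability(edges, query, fixed=None):
    """Probability that `query` holds, by enumerating possible worlds.

    edges: dict mapping an edge (here an RDF-style triple) to its probability.
    query: function taking the set of present edges and returning True/False.
    fixed: edges whose presence is already known (assumed crowd verdicts).
    Exponential in the number of free edges, so only for tiny examples.
    """
    fixed = fixed or {}
    free = [e for e in edges if e not in fixed]
    total = 0.0
    for bits in product([True, False], repeat=len(free)):
        weight = 1.0
        for e, present in zip(free, bits):
            weight *= edges[e] if present else 1 - edges[e]
        present_edges = {e for e, on in fixed.items() if on}
        present_edges.update(e for e, on in zip(free, bits) if on)
        if query(present_edges):
            total += weight
    return total


def best_edge_to_ask(edges, query):
    """Return (edge, gain) with the largest expected entropy reduction."""
    base = entropy(answer_probability(edges, query))
    best, best_gain = None, float("-inf")
    for e, p in edges.items():
        h_yes = entropy(answer_probability(edges, query, {e: True}))
        h_no = entropy(answer_probability(edges, query, {e: False}))
        gain = base - (p * h_yes + (1 - p) * h_no)
        if gain > best_gain:
            best, best_gain = e, gain
    return best, best_gain


# Hypothetical probabilistic RDF graph: triple -> probability of being correct.
EXAMPLE_EDGES = {
    ("alice", "worksAt", "ecnu"): 0.5,
    ("ecnu", "locatedIn", "shanghai"): 0.9,
    ("alice", "knows", "bob"): 0.6,
}


def example_query(present):
    """Holds when both edges of the path alice -> ecnu -> shanghai are present."""
    return (("alice", "worksAt", "ecnu") in present
            and ("ecnu", "locatedIn", "shanghai") in present)


if __name__ == "__main__":
    print(best_edge_to_ask(EXAMPLE_EDGES, example_query))
```

In this toy graph the edge whose verdict changes the answer probability the most is the one selected, while an edge that the query never touches yields no gain. Exhaustive enumeration becomes infeasible quickly as the graph grows, which is presumably the cost that pruning strategies such as those named in the abstract aim to avoid.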