Graph Repairing Rule Discovery Based on Graph Constant Conditional Functional Dependencies
Data consistency is an important part of data quality management. To improve the consistency of graph data, many data dependency theories from relational databases have been introduced into graph databases, including graph functional dependencies and graph association rules. The graph repairing rule is a recently proposed data dependency rule for graph data with strong repairing capability, but no effective mining algorithm for it exists yet. To generate graph repairing rules automatically and improve the reliability of graph data repair, a method called GenGRR is proposed that transforms graph constant conditional functional dependencies into graph repairing rules. Isomorphic subgraphs are matched in the graph using the graph pattern and mapped into a two-dimensional node-attribute table; error patterns are extracted from the corresponding attribute domains of the table to transform a graph constant conditional functional dependency into a graph attribute value repairing rule. Graph attribute supplement rules are generated by deleting from the graph pattern the nodes, and their connected edges, that correspond to the right-hand side (RHS) of the constant conditional functional dependency. The generated graph repairing rules are filtered and checked for consistency based on the maximum common isomorphic subgraph. Tests on several real data sets verify that the graph repairing rules generated by this transformation achieve a better repair effect than repairing graph data directly with graph constant conditional functional dependencies.
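The transformation step described in the abstract can be illustrated with a minimal sketch. It is not the GenGRR implementation: it assumes subgraph matching has already produced the node-attribute table (here a list of dictionaries), and the names `ConstantCFD`, `ValueRepairRule`, `derive_rule` and `apply_rule`, as well as the sample data, are hypothetical. The sketch only shows the idea of collecting observed wrong RHS values as error patterns and repairing a value only when it matches one of them.

```python
from dataclasses import dataclass


@dataclass
class ConstantCFD:
    """Hypothetical constant CFD: if all LHS constants hold, RHS must equal rhs_value."""
    lhs: dict
    rhs_attr: str
    rhs_value: str


@dataclass
class ValueRepairRule:
    """Hypothetical attribute value repairing rule with explicit error patterns."""
    lhs: dict
    rhs_attr: str
    error_values: frozenset
    correct_value: str


def matches(row, lhs):
    # A row satisfies the LHS when every conditioned attribute has the given constant.
    return all(row.get(attr) == value for attr, value in lhs.items())


def derive_rule(table, cfd):
    # Collect the wrong, non-missing RHS values seen in rows that satisfy the LHS.
    errors = {
        row[cfd.rhs_attr]
        for row in table
        if matches(row, cfd.lhs)
        and row.get(cfd.rhs_attr) not in (None, cfd.rhs_value)
    }
    return ValueRepairRule(dict(cfd.lhs), cfd.rhs_attr, frozenset(errors), cfd.rhs_value)


def apply_rule(row, rule):
    # Repair only when the current value is a known error pattern; otherwise keep the row.
    if matches(row, rule.lhs) and row.get(rule.rhs_attr) in rule.error_values:
        return {**row, rule.rhs_attr: rule.correct_value}
    return dict(row)


# Assumed sample table, standing in for rows mapped from matched subgraphs.
table = [
    {"country": "China", "capital": "Beijing"},
    {"country": "China", "capital": "Shanghai"},
    {"country": "China", "capital": None},
    {"country": "France", "capital": "Paris"},
]
cfd = ConstantCFD({"country": "China"}, "capital", "Beijing")
rule = derive_rule(table, cfd)
repaired = [apply_rule(row, rule) for row in table]
```

In this sketch a value outside the extracted error patterns is left unchanged, whereas applying the dependency directly would overwrite every violating value. Missing values are also left alone here; under the abstract's description they would be handled by the separate attribute supplement rules.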
Li Jie (李杰); Cao Jianjun (曹建军); Wang Baowei (王保卫); Zhuang Yuan (庄园)
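School of Computer Science, School of Cyberspace Security, Nanjing University of Information Science and Technology, Nanjing, Jiangsu 210044; The Sixty-third Research Institute, National University of Defense Technology, Nanjing, Jiangsu 210007; Laboratory for Big Data and Decision, National University of Defense Technology, Changsha, Hunan 410073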
Computer Science and Automation
data consistency; data quality; graph functional dependency; graph repairing rule; subgraph isomorphism; maximum common isomorphic subgraph
Computer Technology and Development (《计算机技术与发展》), 2024 (004)
7-15 / 9
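Supported by the National Natural Science Foundation of China (61972207); China Postdoctoral Science Foundation Special Funding Project (2015M582832); National Science and Technology Major Project (2015ZX01040201-003)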