Data lineage construction method for complex data audit requirements

Pan Xiaohua (潘晓华), Jin Yong (金泳), Gao Yanghua (高扬华), Zhu Xinzhou (朱心洲), Shen Shijing (沈诗婧)

Application Research of Computers (计算机应用研究), 2024, Vol. 41, Issue 1: 76-82, 7 pages. DOI: 10.19734/j.issn.1001-3695.2023.05.0214

Data lineage construction method for complex data audit requirements

Pan Xiaohua¹, Jin Yong², Gao Yanghua², Zhu Xinzhou¹, Shen Shijing³

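Author information

  • 1. School of Software Technology, Zhejiang University, Hangzhou 310058
  • 2. Information Center, China Tobacco Zhejiang Industrial Co., Ltd., Hangzhou 310007
  • 3. Research Center for Domestic IT Application Innovation Technology, Binjiang Institute of Zhejiang University, Hangzhou 310053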

Abstract

For complex data audit requirements, existing methods rely on querying and analyzing the information of each execution statement in the database, which makes data auditing inefficient. Some methods use data lineage tools for quick search, but they require intruding into the system to obtain source code, which can easily lead to data leakage or malicious tampering. To address these issues, this paper proposes a data lineage construction method for complex data audit requirements that integrates key technologies such as log preprocessing, data relationship analysis, and data alignment. By analyzing the system's running log information, the method constructs the data lineage graph in a non-invasive manner, and on this basis a data audit tool is built for inbound and outbound tobacco logistics. The test dataset consists of 155,728 transaction logs corresponding to 13,796 batches of goods in tobacco logistics, and comparative experiments are conducted on three aspects: completeness, construction cost, and data audit efficiency. The experimental results show that the proposed method completes the query task within 10 s and uses 1.23 MB of memory per hundred items, which is clearly less than existing methods. Compared with existing methods, the proposed method constructs a complete and accurate data lineage at data-level granularity, and using this lineage greatly improves the efficiency of data auditing in cigarette logistics.
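
As a rough illustration of the log-driven, non-invasive idea described in the abstract, the following minimal Python sketch builds a small lineage graph from log lines and traces a batch back to its origins. It is not the authors' implementation: the log format, the `MOVE` record type, and the function names `parse_logs`, `build_lineage`, and `trace_upstream` are all assumptions made for this example, and the data alignment step is omitted.

```python
import re
from collections import defaultdict, deque

# Hypothetical log format (an assumption, not taken from the paper):
#   "<timestamp> MOVE batch=<id> from=<source> to=<target>"
LOG_PATTERN = re.compile(
    r"^(?P<ts>\S+) MOVE batch=(?P<batch>\S+) from=(?P<src>\S+) to=(?P<dst>\S+)$"
)


def parse_logs(lines):
    """Preprocessing: keep well-formed records only and drop exact duplicates."""
    seen = set()
    records = []
    for line in lines:
        m = LOG_PATTERN.match(line.strip())
        if m is None:
            continue  # skip lines that do not match the assumed format
        rec = (m["ts"], m["batch"], m["src"], m["dst"])
        if rec not in seen:
            seen.add(rec)
            records.append(rec)
    return records


def build_lineage(records):
    """Relationship analysis: map each (batch, location) node to its direct upstream nodes."""
    upstream = defaultdict(set)
    for _ts, batch, src, dst in records:
        upstream[(batch, dst)].add((batch, src))
    return upstream


def trace_upstream(upstream, node):
    """Audit query: breadth-first walk over every ancestor of `node`."""
    found = []
    visited = {node}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for parent in sorted(upstream.get(current, ())):
            if parent not in visited:
                visited.add(parent)
                found.append(parent)
                queue.append(parent)
    return found
```

Because the graph is derived only from log records, a query such as `trace_upstream(graph, ("B1", "shelf"))` needs no access to the source code of the system that produced the logs.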

Key words

data lineage/non-invasive/data audit/cigarette logistics/automated job

Classification

Information Technology and Security Science

Cite this article

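Pan Xiaohua, Jin Yong, Gao Yanghua, Zhu Xinzhou, Shen Shijing. Data lineage construction method for complex data audit requirements [J]. Application Research of Computers, 2024, 41(1): 76-82.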

Funding

Zhejiang Provincial Science and Technology Program (2023C01213)

"Pioneer" and "Leading Goose" R&D Program

Application Research of Computers

Open access; Peking University Core Journals; CSTPCD

ISSN 1001-3695
