Application Research of Computers (计算机应用研究), 2024, Vol. 41, Issue 1: 76-82, 7. DOI: 10.19734/j.issn.1001-3695.2023.05.0214
Data lineage construction method for complex data audit requirements
Abstract
Existing methods for complex data audit requirements rely on querying and analyzing each execution statement in the database, which makes data auditing inefficient. Some methods use data lineage tools for quick search, but they require intrusion into the system to obtain source code, which can easily lead to data leakage or malicious tampering. To address these issues, this paper proposes a data lineage construction method for complex data audit requirements that integrates key technologies such as log preprocessing, data relationship analysis, and data alignment. By analyzing the system's running logs, the method constructs the data lineage graph in a non-invasive manner, and a data audit tool for inbound and outbound cigarette logistics is built on it. Using 155,728 transaction logs corresponding to 13,796 batches of goods in cigarette logistics as the test dataset, comparative experiments were conducted on three aspects: completeness, construction cost, and data audit efficiency. The experimental results show that the proposed method completes the query task within 10 s and occupies 1.23 MB of memory per hundred items, which is markedly less than existing methods. Compared with existing methods, the proposed method constructs a complete and accurate data lineage at data-level granularity, and using this lineage greatly improves the efficiency of data auditing in cigarette logistics.
Keywords
data lineage / non-invasive / data audit / cigarette logistics / automated job
Classification
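Information technology and security science
Cite this article
潘晓华, 金泳, 高扬华, 朱心洲, 沈诗婧. Data lineage construction method for complex data audit requirements [J]. Application Research of Computers, 2024, 41(1): 76-82, 7.
Funding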
Science and Technology Program of Zhejiang Province (2023C01213)
"Pioneer" and "Leading Goose" R&D Program