Sequential Data Collection Method with Condensed Local Differential Privacy

Jin Yan (金严) 1, Zhu Youwen (朱友文) 1, Wu Qihui (吴启晖) 2

Journal of Data Acquisition and Processing (数据采集与处理), 2025, Vol. 40, Issue 3: 659-674 (16 pages). DOI: 10.16337/j.1004-9037.2025.03.008

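Author information

  • 1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
  • 2. College of Electronic and Information Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China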

Abstract

Condensed local differential privacy (CLDP) is a metric-based relaxation of local differential privacy (LDP) that offers better utility and flexibility. However, existing solutions fall short in capturing sequence patterns and in utility. To address these limitations, this paper proposes SCM-CLDP, a novel sequential data collection method based on CLDP. During collection, SCM-CLDP takes into account key information such as the length and transitions of sequential data, which allows the data collector to synthesize a privacy-preserving dataset close to the original one. Specifically, depending on the object being perturbed, we propose two collection methods: SCM-VP, based on value perturbation, and SCM-TP, based on transition perturbation. We theoretically prove that both SCM-VP and SCM-TP satisfy sequence-level CLDP, and we compare them with existing solutions on two real datasets in terms of Markov chain model accuracy, synthetic dataset utility, and frequent sequence pattern mining accuracy. The results show that SCM-CLDP performs significantly better than existing solutions, with SCM-VP outperforming SCM-TP in most cases. In the best case, SCM-CLDP reduces the error of the Markov chain model and of the synthetic dataset's distribution by at least one order of magnitude compared with the existing method. Meanwhile, SCM-CLDP improves the accuracy of item frequency ranking on the synthetic dataset and the accuracy of frequent sequence pattern mining by nearly 30% compared with existing solutions.
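To make the idea of a metric-based relaxation concrete, the sketch below perturbs each item of a sequence with an exponential mechanism whose output probabilities decay with distance from the true value, a construction commonly used for CLDP. It is not SCM-VP or SCM-TP, whose details are not given in this abstract. The integer domain, the absolute-difference metric, the per-item budget alpha, and all function names are assumptions chosen for illustration, and the naive item-by-item perturbation shown here does not by itself establish the sequence-level guarantee the paper proves.

```python
import math
import random


def abs_dist(a, b):
    """Assumed metric on an integer item domain (illustrative only)."""
    return abs(a - b)


def em_probabilities(value, domain, alpha, dist):
    """Output distribution of a distance-based exponential mechanism.

    P[y | value] is proportional to exp(-alpha * dist(value, y) / 2).
    """
    weights = [math.exp(-alpha * dist(value, y) / 2.0) for y in domain]
    total = sum(weights)
    return [w / total for w in weights]


def perturb_sequence(seq, domain, alpha, dist, rng):
    """Perturb every item independently (hypothetical, per-item budget alpha)."""
    noisy = []
    for v in seq:
        probs = em_probabilities(v, domain, alpha, dist)
        noisy.append(rng.choices(domain, weights=probs, k=1)[0])
    return noisy


def worst_case_slack(domain, alpha, dist):
    """Max over v, u, y of log(P[y|v] / P[y|u]) - alpha * dist(v, u).

    A value <= 0 (up to rounding) means the metric-based bound
    P[y|v] <= exp(alpha * dist(v, u)) * P[y|u] holds on this domain.
    """
    worst = -math.inf
    for v in domain:
        pv = em_probabilities(v, domain, alpha, dist)
        for u in domain:
            pu = em_probabilities(u, domain, alpha, dist)
            for a, b in zip(pv, pu):
                worst = max(worst, math.log(a / b) - alpha * dist(v, u))
    return worst


def transition_counts(sequences, domain):
    """Count item-to-item transitions, the raw input to a first-order Markov chain."""
    index = {v: i for i, v in enumerate(domain)}
    counts = [[0] * len(domain) for _ in domain]
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[index[a]][index[b]] += 1
    return counts
```

A collector could, for instance, call `perturb_sequence` on each user's sequence and feed the results to `transition_counts`; counts obtained this way are biased by the noise, which is one reason a dedicated design that accounts for length and transitions is needed.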

Key words

condensed local differential privacy / sequential data / Markov chain model / data collection / privacy protection

Classification

Computer Science and Automation

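Cite this article

Jin Yan, Zhu Youwen, Wu Qihui. Sequential Data Collection Method with Condensed Local Differential Privacy [J]. Journal of Data Acquisition and Processing, 2025, 40(3): 659-674.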

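Funding

Key Research and Development Program of Jiangsu Province (Industry Foresight and Key Core Technologies) (BE2022068, BE2022068-1)

National Natural Science Foundation of China (62172216)

Fundamental Research Funds for the Central Universities (NP2024117)

Stable Support Program for Basic Research in National Defense Characteristic Disciplines (ILF240061A24)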

Journal of Data Acquisition and Processing (数据采集与处理)

Open access; Peking University Core Journal

ISSN 1004-9037
