
Distributed Backup Data Deduplication System Based on Data Routing

Yao Min, Yin Jianwei, Tang Yan, Luo Zhiling

Computer Engineering, 2017, Vol. 43, Issue 2: 85-91, 7. DOI: 10.3969/j.issn.1000-3428.2017.02.015

Author information

  • 1. College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China (all four authors)

Abstract

In big data scenarios, traditional backup deduplication systems suffer from defects such as large backup storage requirements and insufficient data throughput. To address these defects, this paper designs a distributed backup data deduplication system based on data routing. The system uses the data chunk as its deduplication granularity, and its functions include data routing and data prefetching. Data routing uses a Bloom filter to query the data chunks to be processed, and average sampling and neighbor sampling based on Jaccard distance are applied to prefetch data chunks. Data routing assigns each data chunk to the corresponding processing node. The hash codes of the chunks obtained through average sampling provide the routing information, and the hash codes of the chunks obtained through neighbor sampling are used for the system's first deduplication pass. Experimental results show that, compared with querying all processing nodes and with fixed data routing, the system significantly increases data throughput while maintaining the deduplication ratio.
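The abstract names its building blocks without giving details, so the following is only a minimal Python sketch of how a Bloom filter, evenly spaced ("average") sampling, and Jaccard distance could fit together in a router. Everything here is an assumption made for illustration and is not taken from the paper: the names `BloomFilter`, `fingerprint`, `average_sample`, `route`, and `jaccard_distance`, the SHA-1 and SHA-256 hash choices, the filter size, the sampling step, and the rule of routing to the node whose filter matches the most sampled fingerprints.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, rare false positives."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Double hashing: derive all bit positions from one SHA-256 digest.
        digest = hashlib.sha256(key).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd stride
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, key):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))


def fingerprint(chunk):
    """Hash code of a data chunk (SHA-1 is an assumed choice)."""
    return hashlib.sha1(chunk).digest()


def average_sample(fingerprints, step=4):
    """Evenly spaced sampling: keep every `step`-th fingerprint."""
    return fingerprints[::step]


def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| for two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def route(chunks, node_filters, step=4):
    """Hypothetical routing rule: send the batch to the node whose Bloom
    filter already contains the most sampled fingerprints (ties go to the
    lowest node index)."""
    sample = average_sample([fingerprint(c) for c in chunks], step)
    hits = [sum(fp in f for fp in sample) for f in node_filters]
    return max(range(len(node_filters)), key=lambda i: hits[i])
```

In this sketch only the sampled fingerprints are checked against each node's filter, so the routing cost grows with the sample size rather than with the full batch. `jaccard_distance` is included only as the standard set definition; how the paper applies it in neighbor sampling is not stated in the abstract.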

Key words

data deduplication / data routing / data prefetching / Bloom filter / Jaccard distance

Classification

Information Technology and Security Science

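Cite this article

Yao Min, Yin Jianwei, Tang Yan, Luo Zhiling. Distributed Backup Data Deduplication System Based on Data Routing[J]. Computer Engineering, 2017, 43(2): 85-91, 7.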

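Funding

National Key Technology R&D Program of China project "Research, Development and Demonstration Application of a Common Technology System for Cross-Boundary Services in the Modern Service Industry" (2013AA01A213).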

Computer Engineering

OA / Peking University Core Journal / CSCD / CSTPCD

ISSN 1000-3428
