Computer Applications and Software (计算机应用与软件), 2016, Vol. 33, Issue 5: 44-46, 60, 4. DOI: 10.3969/j.issn.1000-386x.2016.05.012
一种批处理块级数据去重方法
A CHUNK-BASED DE-DUPLICATION METHOD BASED ON BATCH PROCESS
Abstract
Data de-duplication technology, which eliminates redundant data from backups to save network bandwidth and storage resources, has become a hot research topic in the data storage field. The commonly used chunk-based de-duplication technology suffers from high fingerprint lookup overhead and low system throughput. To address this, the paper proposes a chunk-based de-duplication method using batch processing, which sorts the fingerprints in a memory buffer so that the on-disk index can be looked up sequentially. Moreover, files are stored in the system as bi-pointer-based directed acyclic graphs to eliminate the cost of the small random disk I/Os caused by file reads. Experimental results show that this method breaks the disk I/O bottleneck of fingerprint lookup and hence improves the read and write performance of the system during data de-duplication.
Key words
Backup / Data de-duplication / Fingerprint query / Batch process
Classification
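Information Technology and Security Science
Cite this article
Yang Tianming (杨天明), Wu Haitao (吴海涛). A chunk-based de-duplication method based on batch process [J]. Computer Applications and Software, 2016, 33(5): 44-46, 60, 4.
Funding
Science and Technology Project of Henan Province (142300410288, 132400411178); Key Science and Technology Research Project of the Education Department of Henan Province (112102210446).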
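
The batch fingerprint lookup summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`fingerprint`, `batch_lookup`, `dedup_batch`), the use of SHA-1 as the fingerprint, and the in-memory sorted list standing in for the on-disk index are all assumptions made for illustration. The point it shows is that once the buffered fingerprints are sorted, they can be checked against a sorted index in a single forward pass, instead of one random index access per fingerprint.

```python
import hashlib
from typing import Iterable, Iterator, List, Set


def fingerprint(chunk: bytes) -> str:
    """Hypothetical fingerprint function: SHA-1 hex digest of a chunk."""
    return hashlib.sha1(chunk).hexdigest()


def batch_lookup(buffered: Iterable[str], sorted_index: Iterable[str]) -> Set[str]:
    """Return the buffered fingerprints already present in the sorted index.

    The buffer is sorted once, then both sequences are walked in a single
    forward pass, so the index (assumed sorted) is only read sequentially.
    """
    duplicates: Set[str] = set()
    index_iter: Iterator[str] = iter(sorted_index)
    current = next(index_iter, None)
    for fp in sorted(set(buffered)):
        # Advance the index cursor until it reaches or passes fp.
        while current is not None and current < fp:
            current = next(index_iter, None)
        if current is None:
            break  # index exhausted: all remaining fingerprints are new
        if current == fp:
            duplicates.add(fp)
    return duplicates


def dedup_batch(chunks: List[bytes], sorted_index: List[str]) -> List[bytes]:
    """Return the chunks of one batch that still need to be stored."""
    fps = [fingerprint(c) for c in chunks]
    known = batch_lookup(fps, sorted_index)
    new_chunks: List[bytes] = []
    seen: Set[str] = set()
    for chunk, fp in zip(chunks, fps):
        # Skip chunks found in the index and repeats within the batch itself.
        if fp not in known and fp not in seen:
            seen.add(fp)
            new_chunks.append(chunk)
    return new_chunks
```

In a real system the sorted index would be streamed from disk rather than held in a list, and the new fingerprints would be merged back into it after each batch; both steps are omitted here.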