大数据下MongoDB数据库档案文档存储去重研究

贺建英

Modern Electronics Technique (现代电子技术), 2015, Vol. 38, Issue 16: 51-55, 5.

Research on duplicated document removal in big data archive storage of MongoDB database

贺建英1

Author information

  • 1. School of Computer Science, Sichuan University of Arts and Science (四川文理学院), Dazhou 635000, Sichuan, China

Abstract

To address the current state of document storage under big data, this paper analyzes the causes of duplication in document storage and proposes a MongoDB-based method for storing documents. MongoDB's GridFS is used to store documents of different types. Three collections are defined to store the uploader records, the document information records, and the content of the chunked documents, respectively. Duplicates are removed by checking whether the MD5 checksums of two documents are identical, and the program code that implements duplicate removal is given. A distributed in-memory database is used to improve the scalability of the document storage system. The experimental results show that the method removes duplicate documents effectively and improves query efficiency.
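The MD5-based check described in the abstract can be illustrated with a minimal sketch. It is not the paper's code: plain Python containers stand in for the three MongoDB collections, and the names `DocumentStore`, `uploads`, `files`, `chunks`, and `CHUNK_SIZE` are hypothetical, chosen only for this example. The idea shown is that every upload is recorded, while the chunked content is written only when its MD5 checksum has not been seen before.

```python
import hashlib

CHUNK_SIZE = 255 * 1024  # assumed chunk size in bytes, for illustration only


class DocumentStore:
    """In-memory stand-in for the three collections named in the abstract."""

    def __init__(self):
        self.uploads = []  # uploader records: who uploaded which file name
        self.files = {}    # document information records, keyed by MD5
        self.chunks = {}   # chunked document content, keyed by (md5, index)

    def upload(self, uploader, filename, data):
        """Record the upload; store content only if its MD5 is new.

        Returns True when new content was stored, False for a duplicate.
        """
        md5 = hashlib.md5(data).hexdigest()
        # The uploader record is always kept, even for duplicate content.
        self.uploads.append(
            {"uploader": uploader, "filename": filename, "md5": md5}
        )
        if md5 in self.files:
            return False  # same checksum already stored: skip the content
        self.files[md5] = {"md5": md5, "length": len(data)}
        for n, start in enumerate(range(0, len(data), CHUNK_SIZE)):
            self.chunks[(md5, n)] = data[start:start + CHUNK_SIZE]
        return True

    def read(self, md5):
        """Reassemble a document from its chunks in index order."""
        parts = []
        n = 0
        while (md5, n) in self.chunks:
            parts.append(self.chunks[(md5, n)])
            n += 1
        return b"".join(parts)
```

In a real deployment the three containers would be MongoDB collections and the lookup by checksum would be a query against an indexed field; the sketch only shows the control flow of the duplicate check. Note that identical MD5 values are treated here as identical content, which is the assumption the abstract itself makes.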

Key words

MongoDB/MD5/big data/archive document deduplication/GridFS

Classification

Information Technology and Security Science

Cite this article

贺建英. 大数据下MongoDB数据库档案文档存储去重研究[J]. 现代电子技术, 2015, 38(16): 51-55, 5.

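Funding

State Archives Administration of China project: Research on models and methods for archive data deduplication based on big data (2014-X-65)

Sichuan Provincial Department of Education general project: Research on NoSQL database applications in a big data environment (14ZB0313)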

Modern Electronics Technique (现代电子技术)

Open access (OA); Peking University Core Journal (北大核心); CSTPCD

ISSN 1004-373X
