Computer Applications and Software, Issue (2): 21-24, 37, 5. DOI: 10.3969/j.issn.1000-386x.2016.02.005
Research on Hadoop and MongoDB Integration Technology for Big Data Processing
ON MONGODB AND HADOOP INTEGRATION TECHNOLOGY FOR BIG DATA PROCESSING
Abstract
With the exponential growth in data variety and volume, NoSQL technology and MapReduce for scalable parallel analysis have attracted considerable attention. MongoDB, a typical NoSQL database, supports scalable indexing and flexible queries over massive data, but its built-in MapReduce cannot meet the needs of complex data analysis and computation. Hadoop offers a powerful MapReduce framework for parallel computing, but it suffers from high latency in real-time services. In view of this, we propose and implement four integration schemes for Hadoop and MongoDB, taking into account scalability, data locality, and I/O performance. We design three representative applications as performance benchmarks and run comparative experiments across the integration schemes to derive the optimal scheme for each scenario. Experiments show that when MongoDB and Hadoop are used together, applying an appropriate integration scheme to each application can improve performance by up to 3 times.
Keywords
Integration / MongoDB / Hadoop / Big data
Classification
Information Technology and Security Science
Cite this article
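ZENG Qiang, MIAO Li, QIN Zheng. Research on Hadoop and MongoDB Integration Technology for Big Data Processing [J]. Computer Applications and Software, 2016, (2): 21-24, 37, 5.
Funding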
National Natural Science Foundation of China (61272546).