Development Method of MapReduce Oriented Data Flow Processing (面向MapReduce的数据处理流程开发方法)

Yi Xiaohua (易小华), Liu Jie (刘杰), Ye Dan (叶丹)

Journal of Frontiers of Computer Science and Technology (计算机科学与探索), 2011, Vol. 5, Issue 2: 161-169 (9 pages). DOI: 10.3778/j.issn.1673-9418.2011.02.006

Development Method of MapReduce Oriented Data Flow Processing

Yi Xiaohua (易小华)¹, Liu Jie (刘杰)², Ye Dan (叶丹)³

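Author information

  • 1. Technology Center of Software Engineering, Institute of Software, Chinese Academy of Sciences, Beijing 100190
  • 2. Graduate University of Chinese Academy of Sciences, Beijing 100190
  • 3. Department of Computer Science and Technology, University of Science and Technology of China, Hefei 230026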

Abstract

In the age of information explosion, data flow processing is widespread and has taken on new characteristics, notably massive scale and parallelism. Meanwhile, more and more people choose MapReduce to process their data because of its simplicity and its high capability at low cost. However, MapReduce does not directly support complex data flow processing that involves N steps, N branches, and multiple data sets. This paper proposes a model-driven development method for data flow processing based on MapReduce. It first defines the logical and physical models of the data flow as well as the component model, then designs model transformation and code generation algorithms, and finally uses these algorithms to generate MapReduce program code that implements the function defined by the logical model and can run on the Hadoop platform. Based on this method, a development tool, CloudDataFlow, is implemented. Experiments show that, compared with similar systems, it offers higher performance, extensibility, and usability.
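
As a rough illustration of the logical-to-physical model transformation mentioned in the abstract, the following Python sketch splits a linear logical flow into MapReduce-style jobs at each grouping step and then simulates them in memory. It is not the paper's algorithm and not CloudDataFlow's implementation: the names `Op`, `to_physical`, `run_job`, and `run_flow` are hypothetical, only linear (non-branching, single-input) flows are handled, and nothing here runs on Hadoop.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class Op:
    """One step of a (hypothetical) logical flow: 'map', 'filter', or 'group'."""
    kind: str
    fn: Callable[[Any], Any]


Job = Tuple[List[Op], Optional[Op]]  # (map-side ops, optional reduce op)


def to_physical(ops: List[Op]) -> List[Job]:
    """Cut the logical flow into jobs; every 'group' op closes one job."""
    jobs: List[Job] = []
    pending: List[Op] = []
    for op in ops:
        if op.kind == "group":
            jobs.append((pending, op))
            pending = []
        else:
            pending.append(op)
    if pending:
        jobs.append((pending, None))  # trailing map-only job
    return jobs


def run_job(job: Job, records: List[Any]) -> List[Any]:
    """Simulate one job: map phase, then an optional shuffle and reduce."""
    map_ops, reduce_op = job
    mapped = []
    for rec in records:
        keep = True
        for op in map_ops:
            if op.kind == "filter":
                if not op.fn(rec):
                    keep = False
                    break
            else:
                rec = op.fn(rec)
        if keep:
            mapped.append(rec)
    if reduce_op is None:
        return mapped
    # Shuffle: map output is assumed to be (key, value) pairs.
    mapped.sort(key=lambda kv: kv[0])
    return [
        (key, reduce_op.fn([value for _, value in group]))
        for key, group in groupby(mapped, key=lambda kv: kv[0])
    ]


def run_flow(ops: List[Op], records: List[Any]) -> List[Any]:
    """Run the generated jobs in sequence, feeding each job's output to the next."""
    for job in to_physical(ops):
        records = run_job(job, records)
    return records


if __name__ == "__main__":
    # Word count followed by a filter: two jobs (map+reduce, then map-only).
    flow = [
        Op("map", lambda w: (w, 1)),
        Op("group", sum),
        Op("filter", lambda kv: kv[1] > 1),
    ]
    print(run_flow(flow, ["a", "b", "a", "c", "b", "a"]))  # [('a', 3), ('b', 2)]
```

In this sketch the filter that follows the grouping step cannot share a job with it, so it becomes a second, map-only job. That is the kind of multi-step structure the abstract says plain MapReduce does not express directly.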

Keywords

MapReduce / data processing flow / model-driven / Hadoop platform

Classification

Information Technology and Security Science

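Cite this article

Yi Xiaohua, Liu Jie, Ye Dan. Development Method of MapReduce Oriented Data Flow Processing [J]. Journal of Frontiers of Computer Science and Technology, 2011, 5(2): 161-169.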

Funding

The National Science and Technology Major Project of China, "Core Electronic Devices, High-end Generic Chips and Basic Software" (国家科技重大专项核高基项目), under Grant No. 2009ZX01043-003-002.

The National Science and Technology Support Program of China (国家科技支撑计划) under Grant Nos. 2009BAG18B01 and 2009BADA9B02.

Journal of Frontiers of Computer Science and Technology (计算机科学与探索)

OA / CSCD / CSTPCD

ISSN 1673-9418
