An Improved Data Stream Update Algorithm in Real-time Data Warehouse

Pan Zhengbing (潘郑冰), Dai Muhong (戴牡红)

Computer Engineering, 2014, Issue (10): 43-46, 51, 5. DOI: 10.3969/j.issn.1000-3428.2014.10.009

Author information

  • 1. School of Software, Hunan University, Changsha 410082

Abstract

To integrate data efficiently in a data warehouse when the data distribution is skewed, this paper proposes an improved data stream update algorithm, Extended Hybrid Join (EH-JOIN). The algorithm improves on the traditional hash join method: by using an index structure and keeping part of the master data in memory, it adapts to commonly skewed data and greatly reduces the disk I/O cost. Experimental results show that, when the relation set is kept at an appropriate size, the service rate of the proposed algorithm is 96% and 80% higher than that of the MESHJOIN and R-MESHJOIN algorithms, respectively; when the memory size is varied, it is 57% and 48% higher, respectively.
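
The abstract describes the idea only at a high level, so the following is a minimal illustrative sketch rather than the EH-JOIN algorithm itself. It assumes that stream tuples are joined against master data by key, that frequently requested master rows are kept in an in-memory hash table, and that all other keys are resolved through an indexed disk lookup. The class name `HybridJoinSketch`, the parameters `cache_capacity` and `hot_threshold`, and the dictionary standing in for the disk-resident relation are all hypothetical.

```python
from collections import Counter


class HybridJoinSketch:
    """Toy stream-to-master join with an in-memory cache for frequent keys.

    Assumption: `disk_master` is a dict that stands in for an indexed,
    disk-resident master relation; each lookup in it counts as one disk read.
    """

    def __init__(self, disk_master, cache_capacity, hot_threshold):
        self.disk_master = disk_master
        self.cache_capacity = cache_capacity  # max master rows held in memory
        self.hot_threshold = hot_threshold    # disk lookups before a key is cached
        self.cache = {}                       # in-memory hash table of hot rows
        self.disk_lookups = Counter()         # per-key count of disk lookups
        self.disk_reads = 0                   # total simulated disk reads

    def _lookup_disk(self, key):
        # Simulated index probe into the disk-resident master data.
        self.disk_reads += 1
        return self.disk_master.get(key)

    def join(self, stream):
        """Join (key, payload) stream tuples with master rows on key."""
        joined = []
        for key, payload in stream:
            if key in self.cache:
                row = self.cache[key]         # served from memory, no disk read
            else:
                row = self._lookup_disk(key)
                if row is None:
                    continue                  # no matching master row
                self.disk_lookups[key] += 1
                is_hot = self.disk_lookups[key] >= self.hot_threshold
                if is_hot and len(self.cache) < self.cache_capacity:
                    self.cache[key] = row     # promote a frequent key to memory
            joined.append((key, payload, row))
        return joined
```

With a skewed stream, a few keys account for most tuples, so once those keys are promoted to the in-memory table most probes avoid the simulated disk read. This is the intuition behind the reduced disk I/O cost that the abstract reports; the actual caching policy and index layout of EH-JOIN are not given in this excerpt.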

Key words

real-time data warehouse / data transformation / data stream update / stream-based join / Hash index / skewed distribution

Classification

Information Technology and Security Science

Cite this article

Pan Zhengbing, Dai Muhong. An Improved Data Stream Update Algorithm in Real-time Data Warehouse [J]. Computer Engineering, 2014, (10): 43-46, 51, 5.

Funding

Supported by the Natural Science Foundation of Hunan Province (2011FJ3034).

Computer Engineering

OA / Peking University Core Journal / CSCD / CSTPCD

ISSN 1000-3428
