Research on Storage and Query of Large-Scale Multidimensional Data (海量多维数据的存储与查询研究)

Song Aibo (宋爱波), Wan Yutong (万雨桐), Gong Huan (贡欢), Xue Yingying (薛荧荧)

Computer Engineering and Applications (计算机工程与应用), 2016, Vol. 52, Issue 13: 25-31, 7. DOI: 10.3778/j.issn.1002-8331.1601-0357


Author information

  • 1. School of Computer Science and Engineering, Southeast University, Nanjing 211189

Abstract

Online analytical processing (OLAP) systems built on data warehouses are the most popular tools for analyzing large-scale multidimensional data. As information technology develops, data volumes grow rapidly and data structures become increasingly complex, so the performance of OLAP systems has dropped severely and no longer meets daily data analysis needs. This paper proposes new methods for storing large-scale multidimensional data and performing aggregation queries on Hadoop, a parallel computing system. It implements a new column-store format, HCFile (HDFS column file), and proposes a new storage solution based on it. This solution improves the efficiency of aggregation and scales well. The paper also leverages the hierarchy schema to build a dimension hierarchy index and uses MapReduce to perform efficient aggregation queries. Comparison experiments with Hive show that the proposed storage solution and aggregation query method effectively improve the efficiency of large-scale multidimensional data analysis.
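The abstract names three ingredients: column-oriented storage, a dimension hierarchy index, and a MapReduce-style aggregation. It does not describe the HCFile layout or the index structure, so the following is only a toy in-memory sketch of how such pieces could fit together, not the paper's implementation and not Hadoop code. The table contents, the city-to-province hierarchy, and every function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical fact table stored column-wise: one list per column,
# so an aggregation can read only the columns it needs.
columns = {
    "city": ["Nanjing", "Suzhou", "Hangzhou", "Nanjing", "Ningbo"],
    "sales": [10, 20, 30, 40, 50],
}

# Hypothetical two-level dimension hierarchy: city -> province.
city_to_province = {
    "Nanjing": "Jiangsu",
    "Suzhou": "Jiangsu",
    "Hangzhou": "Zhejiang",
    "Ningbo": "Zhejiang",
}


def build_hierarchy_index(city_column, parent_of):
    """Map each higher-level member to the row ids that roll up to it."""
    index = defaultdict(list)
    for row_id, city in enumerate(city_column):
        index[parent_of[city]].append(row_id)
    return dict(index)


def map_phase(index, measure_column, wanted_members):
    """Emit (member, value) pairs, touching only the measure column."""
    for member in wanted_members:
        for row_id in index.get(member, []):
            yield member, measure_column[row_id]


def reduce_phase(pairs):
    """Sum the emitted values per key, as a reducer would."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


index = build_hierarchy_index(columns["city"], city_to_province)
result = reduce_phase(map_phase(index, columns["sales"], ["Jiangsu", "Zhejiang"]))
print(result)
```

In this sketch the index is built once from the dimension column, and a roll-up query at the province level then reads only the `sales` column for the matching rows. Whether the paper's index works this way is an assumption.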

Key words

large-scale multidimensional data / Hadoop / data index / aggregation query

Classification

Information Technology and Security Science

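Cite this article

Song Aibo, Wan Yutong, Gong Huan, Xue Yingying. Research on storage and query of large-scale multidimensional data [J]. Computer Engineering and Applications, 2016, 52(13): 25-31, 7.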

Funding

National Natural Science Foundation of China (No. 61370207, No. 61572128)

Science and Technology Project of State Grid Corporation of China Headquarters

Computer Engineering and Applications (计算机工程与应用)

OA / PKU Core (北大核心) / CSCD / CSTPCD

ISSN 1002-8331
