MSOLA: An Online Aggregation Technique for Big Data Based on Multi-Dimension Stratified Sampling

Shi Yingjie, Du Fang, You Yadong

Application Research of Computers (计算机应用研究), 2018, Vol. 35, Issue 2: 375-380, 6. DOI: 10.3969/j.issn.1001-3695.2018.02.013

MSOLA: big data online aggregation based on multi-dimension stratified sampling

Shi Yingjie (1), Du Fang (2), You Yadong (1)

Author information

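  • 1. School of Information Engineering, Beijing Institute of Fashion Technology, Beijing 100029
  • 2. School of Information Engineering, Ningxia University, Yinchuan 750021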

Abstract

Online aggregation estimates query results through statistical computation and can provide feedback to users before the query finishes, which is of paramount importance in big data analysis. Existing studies generally adopt uniform sampling, which leads to inaccurate estimates and slow convergence. This paper proposes a multi-dimension stratified sampling technique based on workload characteristics and data distribution, and designs Storm-based algorithms for result estimation and confidence interval computation. Experiments demonstrate that the proposed technique improves the accuracy of the estimated results in online aggregation and scales efficiently.
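
To make the idea of stratified estimation with a confidence interval concrete, the following is a minimal sketch of the textbook stratified estimator for a SUM aggregate. It is not the MSOLA algorithm from the paper and does not use Storm; the function name `stratified_sum_estimate`, the in-memory `strata` dictionary, and the normal-approximation interval (`z = 1.96` for roughly 95% confidence) are all assumptions made for illustration.

```python
import math
import random
import statistics


def stratified_sum_estimate(strata, sample_sizes, z=1.96, rng=None):
    """Estimate SUM over all strata from a simple random sample per stratum.

    strata:       dict mapping stratum name -> list of numeric values
                  (hypothetical in-memory stand-in for partitioned data)
    sample_sizes: dict mapping stratum name -> number of rows to sample
    Returns (estimate, half_width) so the interval is estimate +/- half_width.
    """
    rng = rng or random.Random(0)
    estimate = 0.0
    variance = 0.0
    for name, values in strata.items():
        n_total = len(values)            # N_h: stratum size
        n_sample = sample_sizes[name]    # n_h: rows sampled from the stratum
        sample = rng.sample(values, n_sample)
        mean_h = statistics.fmean(sample)
        # Sample variance needs at least two points.
        var_h = statistics.variance(sample) if n_sample > 1 else 0.0
        # Scale the stratum mean up to a stratum total.
        estimate += n_total * mean_h
        # Variance of the stratum total, with finite population correction.
        variance += n_total ** 2 * (1 - n_sample / n_total) * var_h / n_sample
    return estimate, z * math.sqrt(variance)
```

Because each stratum contributes its own variance term, a stratum with similar values adds little uncertainty even when only a few rows are sampled from it. This is the usual reason stratified sampling can converge faster than uniform sampling on skewed data. As more rows are read, `sample_sizes` grows and the interval narrows, which mirrors the progressive feedback that online aggregation provides.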

Keywords

online aggregation (OLA) / big data / multi-dimension stratified sampling / workload analysis

Classification

Information Technology and Security Science

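Cite this article

Shi Yingjie, Du Fang, You Yadong. MSOLA: big data online aggregation based on multi-dimension stratified sampling [J]. Application Research of Computers, 2018, 35(2): 375-380, 6.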

Funding

National Natural Science Foundation of China (61502279, 61363018)

Natural Science Foundation of Shandong Province (ZR2015FM013)

Science and Technology Program of the Beijing Municipal Education Commission (KM201710012008)

Application Research of Computers (计算机应用研究)

Indexing: OA, Peking University Core Journals, CSCD, CSTPCD

ISSN: 1001-3695
