Computer Applications and Software (计算机应用与软件), 2018, Vol. 35, Issue 4: 33-43 (11 pages). DOI: 10.3969/j.issn.1000-386x.2018.04.007
A FRAMEWORK FOR SCALABLE STREAM JOIN PROCESSING (可扩展的流数据Join处理框架)
Abstract
Join operations are central to stream query processing. Multiple stream queries are often posed on a single pair of input streams, which gives rise to concurrent join tasks. The join workload therefore increases, and it grows further with larger join windows and higher stream input rates. This calls for a generic (purpose-independent) stream processing mechanism that handles multiple concurrent join tasks efficiently. To this end, this paper proposes S2J, a scalable stream join processing framework built on a dataflow-oriented processing model. S2J performs each join task by distributing its load across an appropriate number of chained join workers, and it employs a tuple-block-based message passing protocol to reduce communication overhead. The framework handles theta-joins efficiently and provides real-time and result-integrity guarantees for join processing. Extensive experiments demonstrate the efficiency and effectiveness of the framework.
Key words
Join operation/Stream/Query/Distributed environment/Optimization
Classification
Information technology and security science
Cite this article
SAI Yinghui (赛影辉), HUANG Hao (黄浩). A framework for scalable stream join processing [J]. Computer Applications and Software, 2018, 35(4): 33-43, 11.
Funding
National Natural Science Foundation of China (61502347).