
Reuse-based job merging execution optimization techniques

张进东 谭光明

高技术通讯, 2025, Vol. 35, Issue 10: 1037-1050 (14 pages). DOI: 10.3772/j.issn.1002-0470.2025.10.001


张进东 (1), 谭光明 (2)

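Author information

  • 1. High Performance Computing Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; University of Chinese Academy of Sciences, Beijing 100049
  • 2. High Performance Computing Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190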

Abstract

With the rise of cloud computing and big data analytics, large-scale job services running on distributed clusters often exhibit significant overlap between jobs. Effectively identifying and reusing overlapping computation is crucial to mitigating the job delays and memory overhead that redundant data access and computation cause in large-scale data processing applications. To address this challenge, this paper proposes MergeLap, a reuse-based job merging execution system. MergeLap employs a job structure signature mechanism and a cost-model-based common substructure selection strategy to efficiently identify and search for maximal common substructures. It caches substructures in a chain cache structure, so that intermediate results can be stored for fast indexing while memory consumption is reduced. Experimental results demonstrate that the proposed approach effectively reduces job execution time and improves memory usage efficiency. Compared with native SparkSQL, MergeLap reduces the running time of batch jobs across multiple workloads by up to 46.5% and decreases cache usage by up to 60.7%.
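The abstract does not give MergeLap's algorithms, so the following is only a minimal sketch of the general idea behind structure signatures and cost-based selection, not the authors' implementation. It assumes each job is a tree of operators, hashes every subtree bottom-up so that structurally identical subtrees in different jobs receive the same signature, and ranks the shared subtrees with a toy cost model. The names `Op`, `signature`, `shared_substructures`, and `pick_best`, as well as the cost formula, are hypothetical.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Hypothetical plan node: operator name, its arguments, and child operators."""
    name: str
    args: str = ""
    children: tuple = ()


def signature(op, index=None, job=None):
    """Return a structural hash of the subtree rooted at `op`.

    Structurally equal subtrees get equal signatures. Child order matters,
    and args are assumed not to contain the "|" separator. When `index` is
    given, every visited subtree is recorded as index[sig] -> [(job, op)].
    """
    child_sigs = [signature(c, index, job) for c in op.children]
    payload = "|".join([op.name, op.args, *child_sigs])
    sig = hashlib.sha256(payload.encode()).hexdigest()
    if index is not None:
        index[sig].append((job, op))
    return sig


def size(op):
    """Number of operators in the subtree rooted at `op`."""
    return 1 + sum(size(c) for c in op.children)


def shared_substructures(jobs):
    """Map signature -> (subtree, job ids) for subtrees found in two or more jobs."""
    index = defaultdict(list)
    for job_id, root in jobs.items():
        signature(root, index, job_id)
    shared = {}
    for sig, hits in index.items():
        job_ids = {j for j, _ in hits}
        if len(job_ids) >= 2:
            shared[sig] = (hits[0][1], job_ids)
    return shared


def pick_best(shared):
    """Toy cost model (an assumption): saved work = subtree size * (sharing jobs - 1)."""
    return max(
        shared.values(),
        key=lambda v: size(v[0]) * (len(v[1]) - 1),
        default=None,
    )


# Two illustrative jobs that share a Scan -> Filter prefix.
job_a = Op("Aggregate", "sum(amount)",
           (Op("Filter", "amount > 100", (Op("Scan", "orders"),)),))
job_b = Op("Project", "id",
           (Op("Filter", "amount > 100", (Op("Scan", "orders"),)),))

shared = shared_substructures({"a": job_a, "b": job_b})
best = pick_best(shared)
```

In this sketch the larger shared subtree (the filter over the scan) wins because caching it saves more operators per reusing job; a realistic cost model would also weigh result size and recomputation cost, which the abstract does not specify.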

Key words

data reuse/computation reuse/common substructure/job merging/cost model

Cite this article

张进东, 谭光明. 基于重用的作业合并执行优化技术[J]. 高技术通讯, 2025, 35(10): 1037-1050.

Funding

Supported by the National Natural Science Foundation of China (62032023).

高技术通讯

Open access; Peking University Core Journal (北大核心); ISSN 1002-0470
