Multi group merging algorithm for solving data Shuffle and data skew of securities companies (融合多分组归并的券商数据Shuffle和数据倾斜算法)

Cao Yakun (曹亚坤), Tang Xiaoyong (唐小勇)

Big Data Research (大数据), 2025, Vol. 11, Issue 6: 123-142, 20. DOI: 10.11959/j.issn.2096-0271.2025074

Multi group merging algorithm for solving data Shuffle and data skew of securities companies

Cao Yakun 1, Tang Xiaoyong 1

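Author information

  • 1. School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha, Hunan 410114, China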

Abstract

In the securities industry, processing and analyzing user data are critical to business decision-making and risk control. However, the vast scale and complexity of user data at securities companies lead to heavy Shuffle operations and data skew in big data computations. Existing optimization methods either rely on hardware upgrades or are limited by domain-specific constraints, and fail to address the problem effectively. To resolve this, a multi-group merging algorithm (MGMA) based on user relationships is proposed, which improves computational efficiency and reduces resource consumption through effective grouping and optimization strategies. Experimental results show that, relative to the non-optimized (NO) control group, MGMA achieved a data skew rate of 20%, memory usage of 72%, and computation time of 61%. On all three indicators, MGMA outperformed the four other optimization methods compared.
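The abstract does not describe the steps of MGMA, so the sketch below is not the authors' algorithm. It only illustrates, under stated assumptions, what data skew across partitions means and how merging groups with their sizes in mind can reduce it. The group sizes, the modulo stand-in for hash partitioning, the greedy largest-first merge, and the skew measure (largest partition divided by mean partition size) are all hypothetical choices made for illustration.

```python
import heapq


def skew_ratio(partition_sizes):
    """Largest partition size divided by the mean size (1.0 means perfectly balanced)."""
    mean = sum(partition_sizes) / len(partition_sizes)
    return max(partition_sizes) / mean


def hash_partition_sizes(group_sizes, num_partitions):
    """Place each group in partition (group_id mod num_partitions).

    This is a simple stand-in for hash partitioning, which ignores group sizes.
    """
    sizes = [0] * num_partitions
    for group_id, size in group_sizes.items():
        sizes[group_id % num_partitions] += size
    return sizes


def greedy_merge_sizes(group_sizes, num_partitions):
    """Merge groups into partitions largest-first, always into the lightest partition.

    This is a generic greedy balancing heuristic, assumed here for illustration only.
    """
    heap = [(0, p) for p in range(num_partitions)]  # (current load, partition index)
    heapq.heapify(heap)
    sizes = [0] * num_partitions
    for size in sorted(group_sizes.values(), reverse=True):
        load, p = heapq.heappop(heap)
        sizes[p] = load + size
        heapq.heappush(heap, (sizes[p], p))
    return sizes


# Hypothetical record counts per user group (not data from the paper).
group_sizes = {0: 900, 1: 50, 2: 40, 3: 30, 4: 300, 5: 20, 6: 250, 7: 10}

hashed = hash_partition_sizes(group_sizes, 4)
merged = greedy_merge_sizes(group_sizes, 4)

print("hash partitioning:", hashed, "skew", skew_ratio(hashed))
print("greedy merging:   ", merged, "skew", skew_ratio(merged))
```

In this toy input the modulo placement happens to put the two largest groups in the same partition, while the size-aware merge spreads them out. The remaining skew comes from the single 900-record group, which no whole-group assignment can split.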

Keywords

Shuffle operations / data skew / preprocessing / securities company data

Classification

Computer science and automation

Cite this article

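Cao Yakun, Tang Xiaoyong. Multi group merging algorithm for solving data Shuffle and data skew of securities companies [J]. Big Data Research (大数据), 2025, 11(6): 123-142, 20.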

Big Data Research (大数据)

ISSN 2096-0271
