A New Distributed Machine Learning Algorithm Library Based on the LOGO Computing Framework

集成技术 (Journal of Integration Technology), 2025, Vol. 14, Issue 3: 24-37, 14. DOI: 10.12146/j.issn.2095-3135.20240224001

梁展雄 1, 孙旭东 1, 蔡湧达 1, 张育铭 1, 麦朗杰 1, 何玉林 2, 黄哲学 3

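Author information

  • 1. College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060
  • 2. Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen 518107
  • 3. College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060; Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen 518107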

Abstract

Unlike the popular MapReduce computing framework, LOGO is a new distributed computing framework based on a LOcal-GlObal computing paradigm. Under the LOGO framework, distributed computing on big data is completed in two steps. The LO operation runs a serial algorithm on a number of slave nodes or virtual machines, each independently processing random sample data blocks to generate local results. The GO operation uploads all local results to the master node and integrates them to obtain an approximate result for the whole big data set. The LOGO computing framework eliminates data communication between nodes during the iterations of an algorithm, which greatly improves computing efficiency, reduces memory requirements, and enhances data scalability. This article proposes a new distributed machine learning algorithm library under the LOGO computing framework. Each distributed algorithm in the library is divided into two parts: the serial algorithm executed by the LO operation and the ensemble algorithm executed by the GO operation. The LO operation can directly execute existing serial machine learning algorithms without rewriting them for MapReduce. The GO operation executes different kinds of ensemble algorithms depending on the ensemble task. The article first introduces the principle of LOGO distributed computing, then describes the structure of the algorithm library, the method for packaging existing serial algorithms, and the ensemble strategy. Finally, it presents the implementation in Spark, App development, and performance test results for various algorithms.
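
The two-step LO/GO pattern described in the abstract can be illustrated with a minimal single-process sketch. This is not the library's code: the function names (`make_random_blocks`, `lo_operation`, `go_operation`, `logo_estimate`) are hypothetical, the "serial algorithm" is simply a mean, the "ensemble" is a size-weighted average, and the blocks are processed in a plain loop rather than on Spark worker nodes.

```python
import random
from statistics import fmean


def make_random_blocks(data, num_blocks, seed=0):
    """Shuffle once, then split into near-equal blocks so that each
    block is a random sample of the full data set (assumed setup)."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::num_blocks] for i in range(num_blocks)]


def lo_operation(block):
    """LO step: run an unmodified serial algorithm on one block.
    Here the serial algorithm is a plain mean; the block size is
    returned as well so the GO step can weight the results."""
    return fmean(block), len(block)


def go_operation(local_results):
    """GO step: integrate the local results on the master.
    Here the ensemble rule is a size-weighted average."""
    total = sum(n for _, n in local_results)
    return sum(m * n for m, n in local_results) / total


def logo_estimate(data, num_blocks, blocks_used, seed=0):
    """Approximate a global statistic from a subset of random blocks.
    No data moves between blocks while the LO step runs."""
    blocks = make_random_blocks(data, num_blocks, seed)
    local_results = [lo_operation(b) for b in blocks[:blocks_used]]
    return go_operation(local_results)


if __name__ == "__main__":
    data = list(range(1000))
    print("exact mean:      ", fmean(data))
    print("3 of 10 blocks:  ", logo_estimate(data, 10, 3))
    print("10 of 10 blocks: ", logo_estimate(data, 10, 10))
```

Using only some of the blocks yields an approximate result, while using all of them reproduces the exact mean in this toy case. In a real deployment the LO step would wrap an existing serial learner and the GO step would use an ensemble rule suited to the task, as the abstract describes.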

Keywords

big data distributed computing; distributed machine learning algorithm library; approximate computing; non-MapReduce computing

Classification

Information technology and security science

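Cite this article

梁展雄, 孙旭东, 蔡湧达, 张育铭, 麦朗杰, 何玉林, 黄哲学. A new distributed machine learning algorithm library based on the LOGO computing framework [J]. 集成技术 (Journal of Integration Technology), 2025, 14(3): 24-37, 14.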

Funding

This work is supported by the Key Basic Research Foundation of Shenzhen (JCYJ20220818100205012), the Natural Science Foundation of Guangdong Province (2023A1515011667), and the Science and Technology Major Project of Shenzhen (KJZD20230923114809020).

集成技术 (Journal of Integration Technology), ISSN 2095-3135
