Radio Engineering (无线电工程), 2026, Vol. 56, Issue 2: 213-221 (9 pages). DOI: 10.3969/j.issn.1003-3106.2026.02.003

HMAPPO: Long-term Throughput Maximization Method for Wireless Powered Edge Computing Networks

GUO Yujie¹, ZHANG Zhifei¹, ZHANG Yu², LIU Tong³, XIONG Ke¹

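Author information

1. School of Computer Science and Technology, Beijing Jiaotong University, Beijing 100044, China
2. State Grid Energy Research Institute Co., Ltd., Beijing 102209, China
3. Beijing Computing Center Co., Ltd., Beijing 100094, China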
Abstract

With the continuing deployment of Internet of Things (IoT) and 5G networks, the computational load and sustainable energy demand of edge sensor devices have increased significantly. By integrating Wireless Power Transfer (WPT) and Mobile Edge Computing (MEC), Wireless-Powered MEC (WP-MEC) offers a promising way to extend the power supply lifetime of edge devices and enhance overall system computing capability. However, previous works focus on single-time-slot resource optimization or single-cell network models, which leads to low resource utilization efficiency and significant deviations from practical scenarios. To address this issue, this work studies a multi-cell, multi-time-slot WP-MEC network based on Non-Orthogonal Multiple Access (NOMA). Energy transmission time, task offloading strategies, and power allocation are jointly optimized, and the energy accumulation gain is fully exploited to maximize the long-term system throughput. To enable efficient resource scheduling in complex and dynamic networks, a Heterogeneous Multi-Agent Proximal Policy Optimization (HMAPPO) algorithm is proposed. By introducing a hierarchical structure with a controller agent and device agents, HMAPPO cooperatively optimizes the global energy transfer time together with local task offloading and power allocation. Unlike value-function-based approaches such as Multi-Agent Soft Actor-Critic (MASAC) or Multi-Agent Twin Delayed Deep Deterministic Policy Gradient (MATD3), HMAPPO adopts a proximal policy optimization mechanism that constrains changes between successive policies. This makes it better suited to multi-slot energy dynamics and continuous action spaces, and thus yields higher training stability in WP-MEC networks. Simulation results demonstrate that the proposed algorithm achieves performance comparable to that of centralized Proximal Policy Optimization (PPO) while realizing distributed optimization, with a performance gap of less than 3.3%. Moreover, the algorithm maintains its performance under different numbers of cells, numbers of devices, and device distances, verifying its superior generalization and scalability.
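
The mechanism that "constrains changes between successive policies" is, in standard PPO, the clipped surrogate objective: the probability ratio between the new and old policy is clipped to the interval [1 - eps, 1 + eps] before it is multiplied by the advantage. The sketch below is a generic illustration of that objective only, not the HMAPPO implementation; the function name `ppo_clipped_objective`, the default `clip_eps=0.2`, and the plain-list batch format are assumptions made for illustration.

```python
import math


def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (to be maximized) over a batch.

    new_logp / old_logp: log-probabilities of the taken actions under the
    new and old policy. advantages: advantage estimates for those actions.
    All names and the default clip_eps are illustrative assumptions.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s)
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps]
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic choice: take the smaller of the two surrogate terms
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

With a positive advantage, a ratio above 1 + eps earns no extra objective value, so a single update has no incentive to move the policy far from the one that collected the data.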

Key words

MEC / wireless powered network / NOMA / long-term throughput maximization / multi-agent reinforcement learning

Classification

Information Technology and Security Science

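Cite this article

GUO Yujie, ZHANG Zhifei, ZHANG Yu, LIU Tong, XIONG Ke. HMAPPO: Long-term Throughput Maximization Method for Wireless Powered Edge Computing Networks [J]. Radio Engineering, 2026, 56(2): 213-221 (9 pages).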

Funding

National Natural Science Foundation of China (62571028)

Radio Engineering (无线电工程)

ISSN: 1003-3106
