Spectrum resource allocation for high-throughput satellite communications based on behavior cloning
Open access; indexed in the Peking University Core Journals list and CSTPCD
In high-throughput multi-beam satellite systems, the dimensionality of the spectrum resource allocation problem increases drastically with the number of beams and users, and the complexity of solving it rises exponentially. To address this challenge, a two-stage algorithm that combines behavior cloning (BC) with deep reinforcement learning (DRL) is proposed. In the first stage, the policy network is pretrained by behavior cloning on existing satellite resource allocation decision data; imitating expert behavior reduces blind exploration and accelerates convergence. In the second stage, the policy network is further optimized with proximal policy optimization (PPO), and a convolutional block attention module (CBAM) is introduced to better extract user traffic state features, thereby improving overall performance. Simulation results show that the proposed algorithm outperforms the benchmark algorithms in convergence speed and stability, and also performs better in system delay, average system satisfaction, and spectrum efficiency.
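The two-stage idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear softmax policy over a small discrete action set, synthetic "expert" decisions generated by a hidden linear rule, and it omits the CBAM feature extractor and the full PPO training loop. All names (`bc_pretrain`, `ppo_clip_objective`, `expert_W`) are hypothetical. Stage 1 fits the policy to expert state-action pairs by cross-entropy; stage 2 is represented only by the standard PPO clipped surrogate objective that a later fine-tuning step would maximize.

```python
import math
import random

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def policy_probs(W, state):
    # W[a] is the weight vector of action a (assumed linear policy).
    return softmax([sum(w * s for w, s in zip(row, state)) for row in W])

def greedy_action(W, state):
    probs = policy_probs(W, state)
    return max(range(len(probs)), key=lambda a: probs[a])

def bc_pretrain(states, expert_actions, n_actions, lr=0.5, epochs=200):
    """Stage 1: behavior cloning by gradient descent on mean cross-entropy."""
    dim, n = len(states[0]), len(states)
    W = [[0.0] * dim for _ in range(n_actions)]
    for _ in range(epochs):
        grad = [[0.0] * dim for _ in range(n_actions)]
        for s, a_star in zip(states, expert_actions):
            p = policy_probs(W, s)
            for a in range(n_actions):
                err = p[a] - (1.0 if a == a_star else 0.0)
                for j in range(dim):
                    grad[a][j] += err * s[j] / n
        for a in range(n_actions):
            for j in range(dim):
                W[a][j] -= lr * grad[a][j]
    return W

def ppo_clip_objective(new_probs, old_probs, advantages, eps=0.2):
    """Stage 2: PPO clipped surrogate objective (to be maximized)."""
    total = 0.0
    for p_new, p_old, adv in zip(new_probs, old_probs, advantages):
        ratio = p_new / p_old
        clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)

# Synthetic expert data (an assumption standing in for real decision logs).
random.seed(0)
states = [[random.gauss(0, 1) for _ in range(4)] for _ in range(300)]
expert_W = [[random.gauss(0, 1) for _ in range(4)] for _ in range(3)]
expert_actions = [greedy_action(expert_W, s) for s in states]

W = bc_pretrain(states, expert_actions, n_actions=3)
bc_accuracy = sum(
    greedy_action(W, s) == a for s, a in zip(states, expert_actions)
) / len(states)
print("agreement with expert after BC pretraining:", bc_accuracy)
```

In a full pipeline, the pretrained weights would initialize the policy before PPO fine-tuning, so that early rollouts already resemble expert decisions rather than random exploration.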
Qin Hao (秦浩); Li Shuangyi (李双益); Zhao Di (赵迪); Meng Haowei (孟昊炜); Song Bin (宋彬)
National Key Laboratory of Space-Air-Ground Integrated Services Networks, Xidian University, Xi'an, Shaanxi 710071; Hangzhou Institute of Technology, Xidian University, Hangzhou, Zhejiang 311200
Electronic and information engineering
high-throughput satellite; behavior cloning; deep reinforcement learning; proximal policy optimization; convolutional block attention module
Journal on Communications (《通信学报》), 2024 (5)
pp. 101-114 (14 pages)
The National Natural Science Foundation of China (No. 62071354, No. 62201419); The Key Research and Development Program of Shaanxi Province (No. 2022ZDLGY05-08)