Computer Engineering, 2025, Vol. 51, Issue 2: 365-374, 10. DOI: 10.19678/j.issn.1000-3428.0068639
Multi-agent Cooperative Caching Strategy Based on TD3 Algorithm
Abstract
To reduce content-acquisition delay and transmission overhead in mobile edge networks, a Multi-Agent Cooperative Caching (MACC) algorithm based on the Twin Delayed Deep Deterministic policy gradient (TD3) is proposed. First, a multi-agent edge cache model is constructed, and the multi-node cache-replacement problem is modeled as a Partially Observable Markov Decision Process (POMDP). The cache state and content-request information of adjacent nodes are integrated into each node's observation space to improve the agent's ability to perceive the environment, and the popularity characteristics of each node's content requests are extracted using triple exponential smoothing, so that the algorithm can adapt to changes in content popularity and improve the cache hit rate. Subsequently, a guiding reward function is designed that combines the transmission delay and overhead of local and adjacent nodes to guide agents to cache cooperatively, thereby reducing the system's cache redundancy and content-transmission overhead. Finally, the TD3 algorithm is combined with the Wolpertinger architecture and extended to multiple agents so that each edge node can adaptively learn its caching strategy and improve system performance. Experimental results show that edge nodes running MACC use a portion of their cache space to cache content requested at neighboring nodes, which improves the cache hit rate. Compared with the MAAC, DDPG, and independent TD3 algorithms on the same dataset, MACC improves the cache hit rate by 8.50%, 13.91%, and 29.21%, respectively. The proposed algorithm can adapt to a dynamic edge environment and reduce content-acquisition delay and transmission overhead.
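The popularity-extraction step mentioned in the abstract can be illustrated with a small sketch. The abstract does not say which variant of triple exponential smoothing is used, so the code below assumes Brown's triple (quadratic) exponential smoothing applied to per-content request counts in successive time slots. The function names, the smoothing factor `alpha`, and the `history` layout are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def brown_triple_smoothing_forecast(
    counts: Sequence[float], alpha: float = 0.5, horizon: int = 1
) -> float:
    """Forecast the request count `horizon` slots ahead.

    Assumption: Brown's triple exponential smoothing; the paper may use
    a different variant (for example Holt-Winters).
    """
    if not counts:
        raise ValueError("counts must not be empty")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")

    # Initialise all three smoothed series with the first observation.
    s1 = s2 = s3 = float(counts[0])
    for x in counts[1:]:
        s1 = alpha * x + (1 - alpha) * s1   # first smoothing of the data
        s2 = alpha * s1 + (1 - alpha) * s2  # smoothing of s1
        s3 = alpha * s2 + (1 - alpha) * s3  # smoothing of s2

    # Level (a), linear trend (b) and quadratic trend (c) coefficients.
    k = alpha / (2 * (1 - alpha) ** 2)
    a = 3 * s1 - 3 * s2 + s3
    b = k * ((6 - 5 * alpha) * s1 - 2 * (5 - 4 * alpha) * s2 + (4 - 3 * alpha) * s3)
    c = k * alpha * (s1 - 2 * s2 + s3)
    return a + b * horizon + c * horizon ** 2


def rank_by_predicted_popularity(
    history: Dict[str, Sequence[float]], alpha: float = 0.5
) -> List[str]:
    """Return content ids sorted by forecast request count, highest first."""
    scores = {
        cid: brown_triple_smoothing_forecast(counts, alpha)
        for cid, counts in history.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical request counts per time slot for three contents.
    history = {
        "a": [1, 2, 3, 4, 5, 6, 7, 8],  # rising demand
        "b": [8, 7, 6, 5, 4, 3, 2, 1],  # falling demand
        "c": [4, 4, 4, 4, 4, 4, 4, 4],  # steady demand
    }
    print(rank_by_predicted_popularity(history))
```

In a caching agent, a ranking or score vector of this kind could be one input feature of the node's observation; how the paper actually encodes it is not stated in the abstract.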
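Keywords
mobile edge network / multi-agent / cooperative cache / deep reinforcement learning / TD3 algorithm
Classification
Information Technology and Security Science
Cite this article
曾建州, 李泽平, 张素勤. Multi-agent Cooperative Caching Strategy Based on TD3 Algorithm [J]. Computer Engineering, 2025, 51(2): 365-374, 10.
Funding
National Natural Science Foundation of China (61462014).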