Privacy Preserving Distributed Bandit Residual Feedback Online Optimization Over Time-Varying Unbalanced Graphs
This paper considers the distributed online optimization (DOO) problem over time-varying unbalanced networks, where gradient information is explicitly unknown. To address this issue, a privacy-preserving distributed online one-point residual feedback (OPRF) optimization algorithm is proposed. The algorithm updates decision variables by using one-point residual feedback to estimate the true gradient information. It can achieve the same performance as the two-point feedback scheme while requiring only a single function value query per iteration. Additionally, it effectively eliminates the effect of time-varying unbalanced graphs by dynamically constructing row-stochastic matrices. Furthermore, compared to other distributed optimization algorithms that only consider explicitly unknown cost functions, this paper also addresses the leakage of nodes' private information. Theoretical analysis demonstrates that the method attains sublinear regret while protecting the private information of agents. Finally, numerical experiments on a distributed collaborative localization problem and on federated learning confirm the effectiveness of the algorithm.
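The one-point residual feedback idea mentioned in the abstract can be illustrated with a minimal single-agent sketch. It assumes the estimator form commonly used in the residual feedback literature, g_t = (d/δ)·(f_t(x_t + δu_t) − f_{t−1}(x_{t−1} + δu_{t−1}))·u_t, with u_t drawn uniformly from the unit sphere; the abstract itself does not state the formula. The helper names (`unit_vector`, `oprf_descent`), the smoothing radius `delta`, and the step size `step` are hypothetical choices for illustration. The paper's consensus step over row-stochastic matrices and its privacy mechanism are not modeled here.

```python
import math
import random


def unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere in R^dim."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0.0:
            return [c / norm for c in v]


def oprf_descent(costs, x0, delta=0.05, step=0.002, seed=0):
    """Single-agent sketch of one-point residual feedback (hypothetical helper).

    costs: list of callables f_t, one per round (the online cost sequence).
    Each round queries f_t exactly once, at the perturbed point
    x_t + delta * u_t, and reuses the value queried in the previous round
    as the residual baseline (assumed estimator form, see lead-in).
    """
    rng = random.Random(seed)
    dim = len(x0)
    x = list(x0)
    prev_value = None  # function value queried in the previous round
    queries = 0
    for f in costs:
        u = unit_vector(dim, rng)
        value = f([xi + delta * ui for xi, ui in zip(x, u)])
        queries += 1
        if prev_value is not None:
            # Residual feedback: difference of two consecutive single queries.
            scale = dim / delta * (value - prev_value)
            x = [xi - step * scale * ui for xi, ui in zip(x, u)]
        prev_value = value
    return x, queries
```

For example, calling `oprf_descent([cost] * 3000, [2.0, -1.0])` with a fixed quadratic `cost` issues one function query per round, in contrast to a two-point scheme, which would issue two.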
Zhongyuan Zhao; Zhiqiang Yang; Luyao Jiang; Ju Yang; Quanbo Ge
School of Automation, Nanjing University of Information Science and Technology, and Jiangsu Key Laboratory of Big Data Analysis Technology, Nanjing 210044, China; School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
Differential privacy; distributed online optimization (DOO); federated learning; one-point residual feedback (OPRF); time-varying unbalanced graphs
IEEE/CAA Journal of Automatica Sinica (English edition of Acta Automatica Sinica), 2024, No. 11
Pages 2284-2297 (14 pages)
This work was supported by the National Natural Science Foundation of China (62033010, U23B2061) and the Qing Lan Project of Jiangsu Province (R2023Q07).