An Offline Reinforcement Learning Based Automated Negotiation Approach

Chen Siqi (陈锶奇)¹, Xiong Zhaoyuan (熊钊远)², Wang Yunfei (汪云飞)¹, Wang Haoyang (王昊杨)¹

Radio Communications Technology (无线电通信技术), 2026, Vol. 52, Issue 1: 1-14. DOI: 10.3969/j.issn.1003-3114.2026.01.001

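Author information

1. School of Information Science and Engineering, Chongqing Jiaotong University, Chongqing 400074
2. School of Traffic and Transportation, Chongqing Jiaotong University, Chongqing 400074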

Abstract

Automated negotiation is a key approach to achieving cooperation and collaboration in multi-agent systems. While Reinforcement Learning (RL)-based negotiating agents have attained remarkable success across various scenarios, they still face constraints imposed by real-world implementation environments. In particular, these agents require extensive online interactions with opponents for training, which is often infeasible and unrealistic in practical applications. Therefore, a novel method is needed to enable the learning of effective negotiation strategies directly from offline datasets. Additionally, during subsequent online negotiations, opponents may alter their strategies due to various factors, such as changes in risk attitudes or market conditions, posing significant challenges to automated negotiation. In this work, we propose a new negotiating agent that enhances performance via offline-to-online RL. The proposed agent is able (1) to interact with opponents using an RL-based strategy to improve its adaptability to dynamic negotiation environments; (2) to learn negotiation strategies from historical offline data without the need for extensive online active interactions; and (3) to optimize the online fine-tuning process to facilitate rapid and stable performance improvements of the pre-learned offline strategies. Extensive experimental results are presented, based on multiple negotiation scenarios and winning agents from recent Automated Negotiating Agents Competitions (ANAC). The results demonstrate that the proposed agent outperforms state-of-the-art alternatives and remains effective even when opponents switch to different strategies.

Key words

automated negotiation/deep reinforcement learning/agent/e-commerce

Classification

Information Technology and Security Science

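Cite this article

Chen Siqi, Xiong Zhaoyuan, Wang Yunfei, Wang Haoyang. An Offline Reinforcement Learning Based Automated Negotiation Approach [J]. Radio Communications Technology, 2026, 52(1): 1-14.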

Funding

National Natural Science Foundation of China (61602391)

Tianjin Science and Technology Plan Project (22JCZDJC00580)

Radio Communications Technology (无线电通信技术)

ISSN: 1003-3114
