Journal of Guangdong University of Technology, 2025, Vol. 42, Issue 5: 105-111, 7. DOI: 10.12052/gdutxb.240052

Deep Reinforcement Learning Based Dialogue Policy Network

Zou Zhibin¹, Hao Zhifeng², Xu Boyan¹

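Author information

  • 1. School of Computer Science, Guangdong University of Technology, Guangzhou 510006, Guangdong, China
  • 2. School of Computer Science, Guangdong University of Technology, Guangzhou 510006, Guangdong, China; College of Engineering, Shantou University, Shantou 515063, Guangdong, China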

Abstract

Dialogue systems are an integral part of human-computer interaction, and their goal is to deliver a natural, intelligent, and smooth dialogue experience. However, traditional dialogue systems struggle to understand user intent, generate reasonable responses, and maintain dialogue coherence. The core reason is a mismatch between the supervised training of the dialogue policy network and the final evaluation metrics: the objective function fails to cover the critical metrics the system requires. Some recent work has begun to explore reinforcement-learning-based training; however, existing approaches still rely on manually defined templates and require too many dialogue turns. To address these problems, this study employs a Transformer-based dialogue policy network and constructs a word-level deep reinforcement learning training process, which improves the interactive performance of the dialogue system by building rewards from multiple dialogue-system metrics. Experimental results show that, relative to traditional supervised training, the policy network trained with deep reinforcement learning improves on the dialogue performance metrics BLEU and Comb while maintaining comparable performance on the other metrics. The contribution of this study is to apply reinforcement learning to the training of policy networks for dialogue systems and to verify its effectiveness and potential experimentally.
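
The abstract does not spell out the reward or the update rule, so the following is only a minimal sketch of how a metric-based, word-level policy-gradient objective could look. It assumes that Comb follows the convention common on MultiWOZ-style benchmarks, (Inform + Success) / 2 + BLEU, that the sequence-level reward is credited to the last generated word and discounted backwards, and that a plain REINFORCE surrogate loss is used. The function names `combined_score`, `word_level_returns`, and `reinforce_loss` are hypothetical and do not come from the paper.

```python
from typing import List


def combined_score(inform: float, success: float, bleu: float) -> float:
    """Comb under the assumed convention: (Inform + Success) / 2 + BLEU."""
    return 0.5 * (inform + success) + bleu


def word_level_returns(num_tokens: int, final_reward: float,
                       gamma: float = 0.99) -> List[float]:
    """Spread one sequence-level reward over the generated words.

    Assumption: only the last word receives the reward; earlier words
    receive the discounted return G_t = gamma ** (T - 1 - t) * final_reward.
    """
    returns = [0.0] * num_tokens
    running = 0.0
    for t in reversed(range(num_tokens)):
        reward = final_reward if t == num_tokens - 1 else 0.0
        running = reward + gamma * running
        returns[t] = running
    return returns


def reinforce_loss(token_log_probs: List[float], returns: List[float]) -> float:
    """REINFORCE surrogate loss: -sum_t G_t * log pi(w_t | context)."""
    if len(token_log_probs) != len(returns):
        raise ValueError("one return is needed per generated word")
    return -sum(g * lp for g, lp in zip(returns, token_log_probs))


if __name__ == "__main__":
    # Illustrative numbers only; they are not results from the paper.
    reward = combined_score(inform=80.0, success=70.0, bleu=18.0)
    returns = word_level_returns(num_tokens=3, final_reward=reward, gamma=0.5)
    print(reward, returns, reinforce_loss([-1.0, -2.0, -0.5], returns))
```

In a real training loop the log-probabilities would come from the Transformer policy network and the loss would be minimized by gradient descent; a baseline is usually subtracted from the returns to reduce variance.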

Keywords

dialogue system / reinforcement learning / dialogue policy learning

Classification

Information Technology and Security Science

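Cite this article

Zou Zhibin, Hao Zhifeng, Xu Boyan. Deep Reinforcement Learning Based Dialogue Policy Network[J]. Journal of Guangdong University of Technology, 2025, 42(5): 105-111, 7.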

Funding

National Natural Science Foundation of China (61876043, 61976052, 62206064)

Journal of Guangdong University of Technology

ISSN 1007-7162
