大语言模型在多代理辩论中作为辩论者表现的比较分析

张立炎 梁志剑

中北大学学报(自然科学版), 2025, Vol. 46, Issue (2): 219-229, 11. DOI: 10.62756/jnuc.issn.1673-3193.2024.08.0016

A Comparative Analysis of Large Language Models as Debaters' Performance in Multi-Agent Debates

张立炎¹ 梁志剑¹

Author information

  • 1. School of Computer Science and Technology, North University of China, Taiyuan, Shanxi 030051, China

Abstract

To explore the potential and limitations of Large Language Models (LLMs) in simulating human intelligence, particularly debate capabilities, a framework called CoRAG-MAD was constructed that integrates Chain-of-Thought (CoT) and Retrieval-Augmented Generation (RAG) techniques into Multi-Agent Debate (MAD). The framework simulates the process of a human debating competition in four stages: opening statements, attack and defence, free debate, and closing statements. It was applied in three distinct debate scenarios: fair debate, unequal debate, and mixed debate. The debate content was analyzed by combining automated evaluation tools with human expert review. Experiments using the OrChiD dataset as the test platform show that CoRAG-MAD can effectively improve several abilities of LLMs across debate scenarios. Specifically, in the unequal debate, LLMs' logical reasoning score improves by up to 57.56% and creativity score by up to 49.77%; in the mixed debate, LLMs' collaborative ability improves by up to 23.36% and overall performance by up to 28.20%. Ablation and comparative experiments were conducted to verify the effectiveness of CoT in enhancing logical reasoning, of RAG in enhancing factual accuracy and stimulating creative thinking, and of the CoRAG approach in MAD.
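As a rough illustration of the four-stage structure described in the abstract, the following Python sketch shows one way such a debate loop might be sequenced. Only the stage names come from the abstract. The `Debater` class, `build_prompt`, `run_debate`, the prompt wording, and the stub model and retriever are hypothetical stand-ins, not the authors' CoRAG-MAD implementation.

```python
from dataclasses import dataclass
from typing import Callable, List

# Stage names are taken from the abstract; everything else is a hypothetical illustration.
STAGES = ["opening statements", "attack and defence", "free debate", "closing statements"]


@dataclass
class Debater:
    name: str
    side: str                              # e.g. "pro" or "con"
    generate: Callable[[str], str]         # stand-in for an LLM call
    retrieve: Callable[[str], List[str]]   # stand-in for a RAG retriever


def build_prompt(debater: Debater, topic: str, stage: str,
                 evidence: List[str], transcript: List[str]) -> str:
    """Combine retrieved evidence (RAG) with a step-by-step instruction (CoT)."""
    lines = [
        f"Topic: {topic}",
        f"You argue the {debater.side} side. Stage: {stage}.",
        "Evidence:",
        *[f"- {item}" for item in evidence],
        "Debate so far:",
        *transcript,
        "Think step by step, then give your speech.",
    ]
    return "\n".join(lines)


def run_debate(topic: str, debaters: List[Debater]) -> List[str]:
    """Run every debater through each stage in order and collect the speeches."""
    transcript: List[str] = []
    for stage in STAGES:
        for debater in debaters:
            evidence = debater.retrieve(topic)
            prompt = build_prompt(debater, topic, stage, evidence, transcript)
            speech = debater.generate(prompt)
            transcript.append(f"[{stage}] {debater.name} ({debater.side}): {speech}")
    return transcript


# Stub components so the sketch runs without any model or retrieval backend.
def stub_model(prompt: str) -> str:
    return "stub speech"


def stub_retriever(query: str) -> List[str]:
    return [f"placeholder evidence about {query}"]


transcript = run_debate(
    "Example topic",
    [Debater("A", "pro", stub_model, stub_retriever),
     Debater("B", "con", stub_model, stub_retriever)],
)
```

In a real setting the two stubs would be replaced by actual model and retrieval calls. The fair, unequal, and mixed scenarios are not modelled here because the abstract does not define how they differ.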

Key words

multi-agent debate / retrieval-augmented generation / chain-of-thought / large language models / NLP

Classification

Information technology and security science

Cite this article

张立炎,梁志剑.大语言模型在多代理辩论中作为辩论者表现的比较分析[J].中北大学学报(自然科学版),2025,46(2):219-229,11.

中北大学学报(自然科学版)

ISSN 1673-3193
