Abstract
To explore the potential and limitations of Large Language Models (LLMs) in simulating human intelligence, particularly in debate, a framework called CoRAG-MAD was constructed that integrates Chain-of-Thought (CoT) and Retrieval-Augmented Generation (RAG) techniques into Multi-Agent Debate (MAD). The framework simulates the process of a human debating competition in four stages: opening statements, attack and defence, free debate, and closing statements. It was applied in three distinct debate scenarios: fair debate, unequal debate, and mixed debate. Combining automated evaluation tools with human expert review, a thorough analysis of the debate content was conducted. Experiments on the OrChiD dataset show that CoRAG-MAD effectively improves several abilities of LLMs across these debate scenarios. Specifically, in the unequal debate, LLMs' logical reasoning score improves by up to 57.56% and their creativity score by up to 49.77%; in the mixed debate, LLMs' collaborative ability improves by up to 23.36% and overall performance by up to 28.20%. Ablation and comparative experiments verify the effectiveness of CoT in enhancing logical reasoning, of RAG in improving factual accuracy and stimulating creative thinking, and of the combined CoRAG approach within MAD.
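To make the four-stage protocol concrete, below is a minimal Python sketch of a CoRAG-style debate loop. This is an illustration only, not the authors' implementation: call_llm and retrieve are hypothetical stand-ins for a real model API and a real retriever, and the stage names follow the abstract.

# Minimal sketch of a CoRAG-style multi-agent debate loop.
# Hypothetical interfaces; not the paper's published code.
from typing import List

STAGES = ["opening statements", "attack and defence",
          "free debate", "closing statements"]

def call_llm(prompt: str) -> str:
    # Placeholder for a real LLM API call.
    return f"[response to: {prompt[:60]}...]"

def retrieve(query: str, k: int = 3) -> List[str]:
    # Placeholder for a RAG retriever over an external corpus.
    return [f"evidence {i} for '{query}'" for i in range(k)]

def debate(topic: str, sides: List[str]) -> List[str]:
    transcript: List[str] = []
    for stage in STAGES:
        for side in sides:
            evidence = "\n".join(retrieve(f"{topic} {stage}"))
            # CoRAG turn: retrieved evidence plus a chain-of-thought
            # instruction, conditioned on the opponent's last turn.
            prompt = (
                f"Stage: {stage}. You argue the {side} side of: {topic}.\n"
                f"Evidence:\n{evidence}\n"
                f"Opponent so far:\n{transcript[-1] if transcript else '(none)'}\n"
                "Think step by step, then give your argument."
            )
            transcript.append(f"{side} ({stage}): {call_llm(prompt)}")
    return transcript

if __name__ == "__main__":
    for turn in debate("AI regulation should be mandatory", ["pro", "con"]):
        print(turn)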
Key words: multi-agent debate / retrieval-augmented generation / chain-of-thought / large language models / NLP
Classification: Information Technology and Security Science