基于大语言模型的自助式AI心理咨询系统构建及其效果评估

黄峰 丁慧敏 李思嘉 韩诺 狄雅政 刘晓倩 赵楠 李林妍 朱廷劭

心理学报 (Acta Psychologica Sinica), 2025, Vol. 57, Issue 11: 2022-2042, 中插35, 22. DOI: 10.3724/SP.J.1041.2025.2022

Self-help AI psychological counseling system based on large language models and its effectiveness evaluation

黄峰 (1), 丁慧敏 (2), 李思嘉 (3), 韩诺 (4), 狄雅政 (5), 刘晓倩 (5), 赵楠 (5), 李林妍 (6), 朱廷劭 (5)

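Author information

  • 1. State Key Laboratory of Cognitive Science and Mental Health, Institute of Psychology, Chinese Academy of Sciences, Beijing 100101; Department of Psychology, University of Chinese Academy of Sciences, Beijing 100049; Department of Data Science, College of Computing, City University of Hong Kong, Hong Kong 999077
  • 2. School of Education, Renmin University of China, Beijing 100872; Department of Psychology, University of Notre Dame, Indiana 46556, USA
  • 3. Department of Social Work and Social Administration, Faculty of Social Sciences, The University of Hong Kong, Hong Kong 999077
  • 4. Department of Psychology, Faculty of Arts and Sciences, Beijing Normal University, Zhuhai, Guangdong 519087; Beijing Key Laboratory of Applied Experimental Psychology, National Demonstration Center for Experimental Psychology Education (Beijing Normal University), Faculty of Psychology, Beijing Normal University, Beijing 100875
  • 5. State Key Laboratory of Cognitive Science and Mental Health, Institute of Psychology, Chinese Academy of Sciences, Beijing 100101; Department of Psychology, University of Chinese Academy of Sciences, Beijing 100049
  • 6. Department of Data Science, College of Computing, City University of Hong Kong, Hong Kong 999077; Department of Infectious Diseases and Public Health, Jockey Club College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Hong Kong 999077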

Abstract

The global prevalence of mental health issues, such as depression and anxiety, has become a significant public health challenge. Traditional mental health services face limitations in accessibility, affordability, and scalability. The emergence of large language models (LLMs) offers new opportunities for developing intelligent, self-help psychological counseling systems. However, optimizing LLMs for mental health applications presents unique challenges, including data scarcity and privacy concerns. This study aimed to address these challenges by constructing a self-help AI psychological counseling system using zero-shot learning and chain-of-thought prompting, and to evaluate the effectiveness of this system in improving mental health outcomes among the general population. It also explored the impact of AI anthropomorphism on human-computer interaction outcomes in mental health interventions.

The study comprised two parts. In Experiment 1, we constructed the AI counseling system based on the GPT-4o model. We first compared GPT-4o with two other LLMs (Claude 3 Opus and Yi-Large) using a test set of 12 common mental health topics covering interpersonal relationships, family issues, personal growth, and other categories. Three qualified psychological counselors evaluated the models' performance on normative quality, professionalism, emotional understanding and empathy, and consistency and coherence. We then optimized GPT-4o using chain-of-thought prompting and role instructions designed explicitly for psychological counseling scenarios, and re-evaluated the optimized model to assess improvements. In Experiment 2, we conducted a two-week randomized controlled trial with 202 participants from the general population who reported experiencing negative emotions or psychological distress but had not been diagnosed with severe mental health issues. Participants were randomly assigned to one of three experimental groups with varying degrees of AI anthropomorphism (F: female counselor image and name; M: male counselor image and name; R: robot image without a human name) or to a control group (C: using unmodified GPT-4o). To ensure active participation, only interactions with at least 10 dialogue rounds and spanning more than 10 minutes were considered valid for analysis. Mental health outcomes, including depression, anxiety, and stress (measured by DASS-21) and loneliness (measured by SSL), were assessed at baseline (T1), during the last two days of the one-week interaction (T2), and one week post-intervention (T3). Linear mixed-effects models were used to analyze the data, with simple effects analysis and Tukey HSD tests for post-hoc comparisons.

In Experiment 1, GPT-4o significantly outperformed the other models in normative quality, emotional understanding and empathy, and consistency and coherence (all p < 0.001). After optimization with chain-of-thought prompting, the model showed further significant improvements across all evaluation dimensions (p < 0.01), with large effect sizes in normative quality (d = 1.28), emotional understanding and empathy (d = 1.06), and consistency and coherence (d = 1.14). Professional competence showed more limited improvement (d = 0.51), reflecting current technological limitations in this dimension. In Experiment 2, the attrition rate from T1 to T3 was 24.3%, with no significant differences in demographic characteristics or baseline mental health indicators between completers and non-completers. After interaction quality control, 180 participants were retained at T2 (retention rate 89.11%) and 153 at T3 (75.74%). All experimental groups showed significant short-term improvements in depression, anxiety, and loneliness at T2 compared to the control group (all p < 0.001). For loneliness, the anthropomorphized AI designs (F and M groups) demonstrated significantly greater effects than the non-anthropomorphized design (R group) at T2. For stress, the group × time interaction reached marginal significance (p = 0.05), with only the non-anthropomorphized group (R group) showing substantial improvement from T1 to T2 (b = 2.35, SE = 0.48, p < 0.001). The improvement in anxiety symptoms persisted at T3 for all experimental groups (p < 0.001), while the effects on depression, stress, and loneliness were no longer significant at follow-up.

This study provides empirical evidence for the potential of AI-based self-help psychological counseling to improve mental health outcomes, particularly by reducing symptoms in the short term. The successful application of zero-shot learning and chain-of-thought prompting in optimizing LLMs for mental health dialogues offers a novel approach to overcoming data scarcity and model adaptation challenges in specialized domains. The differential effects of AI anthropomorphism on various mental health indicators support a nuanced design framework: anthropomorphized designs may be more effective for addressing social functioning-related issues such as loneliness through enhanced social presence, whereas non-anthropomorphized designs might better manage stress by reducing social evaluation pressure. However, the study also reveals significant limitations, including the lack of long-term effects for most outcomes and limited improvement in professional competence. Future research should focus on enhancing the long-term efficacy of AI-assisted mental health interventions, improving professional depth for specialized counseling scenarios, exploring human-AI collaborative models for high-risk cases, and further investigating the mechanisms underlying the differential effects of AI design features on specific mental health issues. These findings provide valuable insights for developing more effective, personalized AI-assisted mental health services to complement traditional care approaches.
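The interaction validity criterion and the retention figures reported in the abstract can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' analysis code: the function names `is_valid_interaction` and `retention_rate` are hypothetical, and the only inputs taken from the abstract are the thresholds (at least 10 dialogue rounds, more than 10 minutes) and the participant counts (202 enrolled, 180 at T2, 153 at T3).

```python
def is_valid_interaction(dialogue_rounds: int, duration_minutes: float) -> bool:
    """Hypothetical helper: apply the abstract's validity criterion,
    i.e. at least 10 dialogue rounds AND more than 10 minutes."""
    return dialogue_rounds >= 10 and duration_minutes > 10


def retention_rate(retained: int, enrolled: int) -> float:
    """Hypothetical helper: retention as a percentage, rounded to 2 decimals."""
    return round(100 * retained / enrolled, 2)


ENROLLED = 202  # participants randomized, as stated in the abstract

t2_retention = retention_rate(180, ENROLLED)  # participants retained at T2
t3_retention = retention_rate(153, ENROLLED)  # participants retained at T3
attrition_t1_t3 = round(100 - t3_retention, 1)

print(t2_retention)     # 89.11
print(t3_retention)     # 75.74
print(attrition_t1_t3)  # 24.3
```

The computed values agree with the reported retention rates (89.11% and 75.74%) and with the 24.3% attrition from T1 to T3.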

Keywords

artificial intelligence / large language models / chain-of-thought / mental health / self-help psychological counseling / randomized controlled trial

Classification

Basic medicine

Cite this article

黄峰, 丁慧敏, 李思嘉, 韩诺, 狄雅政, 刘晓倩, 赵楠, 李林妍, 朱廷劭. 基于大语言模型的自助式AI心理咨询系统构建及其效果评估 [J]. 心理学报, 2025, 57(11): 2022-2042, 中插35, 22.

Funding

Supported by the Beijing Natural Science Foundation (IS23088).

Journal: 心理学报 (Acta Psychologica Sinica); open access (OA); 北大核心 (Peking University Core Journal); ISSN 0439-755X
