
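The linguistic strengths and weaknesses of artificial intelligence: A comparison of large language models and real students on Chinese language proficiency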

Gao Chenghai (高承海), Dang Baobao (党宝宝), Wang Bingjie (王冰洁), Wu Shengtao (吴胜涛)

Acta Psychologica Sinica, 2025, Vol. 57, Issue 6: 947-966, insert pages 4-10, 27. DOI: 10.3724/SP.J.1041.2025.0947

The linguistic strength and weakness of artificial intelligence: A comparison between Large Language Model(s) and real students in the Chinese context

Gao Chenghai 1, Dang Baobao 1, Wang Bingjie 2, Wu Shengtao 3

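Author information

  • 1. Research Center for the Educational Development of Northwest Ethnic Minorities, Northwest Normal University; School of Educational Science, Northwest Normal University, Lanzhou 730070
  • 2. School of Psychology, Northwest Normal University, Lanzhou 730070
  • 3. School of Sociology and Anthropology, Xiamen University, Xiamen 361005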

Abstract

Previous research on generative artificial intelligence (AI) has been conducted primarily in the English context, and the linguistic strengths and weaknesses of generative AI in the Chinese context remain unclear. This study focuses on the accuracy, normativity, affectivity, and creativity of AI in generating language knowledge, and explores its cultural adaptability and its ability to generate humanized and personalized content. Evaluating these key indicators gives a deeper understanding of the linguistic strengths and weaknesses of AI and helps cultivate the unique advantages of humans in education. Combining quantitative and qualitative methods, we evaluated the differences in knowledge accuracy, normativity, affectivity, and creativity between large language models and real students. Specifically, using an explanatory sequential design within the mixed-methods framework, we first tested group differences between the models (GPT-4 and ERNIE-4) and real students on knowledge accuracy, normativity, affectivity, and creativity. Next, through content analyses, we explored the specific performance of large language models on each indicator and the mechanisms behind their linguistic strengths and weaknesses.
Study 1 found that, compared to real students, GPT-4 exhibited higher accuracy in modern text knowledge (especially conceptual knowledge) but lower accuracy in ancient poetry and language usage. The knowledge normativity of GPT-4 was comparable to that of real students, while its affectivity and creativity were lower than those of real students. Moreover, the highest individual scores of GPT-4 in normativity and affectivity were comparable to the highest scores of real students. Study 2, based on ERNIE-4, confirmed these results, and its accuracy in ancient poetry was still lower than that of real students. The results show the advantages of artificial intelligence in modern knowledge and norms, its shortcomings in ancient poetry knowledge, and its potential in affective and creative expression. Taken together, the findings demonstrate the linguistic strength of generative AI in the knowledge accuracy of modern Chinese literature, its weakness in ancient Chinese poetry and in affective and creative writing, and its potential in normative and affective expression. This sheds light on the cultural adaptability and the affective and creative expression of generative AI, and has valuable implications for AI-assisted teaching practice in the Chinese context.

Key words

large language models / language proficiency / accuracy / emotionality / creativity

Classification

Social sciences

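Cite this article

Gao Chenghai, Dang Baobao, Wang Bingjie, Wu Shengtao. The linguistic strengths and weaknesses of artificial intelligence: A comparison of large language models and real students on Chinese language proficiency [J]. Acta Psychologica Sinica, 2025, 57(6): 947-966, insert pages 4-10, 27.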

Funding

Supported by the Major Program of the National Social Science Fund of China (24&ZD189).

Acta Psychologica Sinica

Open access; Peking University Core Journal

ISSN 0439-755X
