Journal of Hebei University of Science and Technology, 2024, Vol. 45, Issue 2: 150-158 (9 pages). DOI: 10.7535/hbkd.2024yx02005
Research on an automatic grading model for HSK reading texts based on supervised learning
Abstract
To address the shortage of effective reference standards and analysis tools for classifying and grading Hanyu Shuiping Kaoshi (HSK) reading materials, HSK reading texts from past years were taken as the study object and their text readability features were extracted. Nine supervised learning algorithms, including support vector machine, decision tree, and extreme gradient boosting, were used to build a model that automatically classifies a self-selected text into the corresponding HSK level. Multiple indicators, such as accuracy and AUC, were adopted to evaluate the grading performance of each model, and the best model was used to build an online tool. The results show that supervised learning performs well in analyzing and grading HSK reading materials. Among the nine models, extreme gradient boosting performs best, with an accuracy of 0.913 and an AUC of 0.994. The grading model and online tool can grade self-selected HSK texts with high accuracy, helping users select suitable texts and improve learning efficiency.
Keywords
natural language processing / supervised learning / HSK reading text / readability feature / grading model
Classification
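Computer and Automation
Cite this article
REN Meng, WANG Fangwei. Research on an automatic grading model for HSK reading texts based on supervised learning[J]. Journal of Hebei University of Science and Technology, 2024, 45(2): 150-158.
Funding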
National Natural Science Foundation of China (61572170)
Hebei Normal University 2023 Intramural Research Fund for Humanities and Social Sciences (S23AI001)
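
The abstract evaluates each grading model by accuracy and AUC. As a rough illustration of how these two indicators can be computed for a multi-level classifier, the sketch below uses plain Python on made-up data. The abstract does not say how AUC was aggregated across HSK levels, so the macro-averaged one-vs-rest scheme here is an assumption, and the function names, labels, and probabilities are hypothetical rather than taken from the paper.

```python
from typing import List, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of texts whose predicted level equals the true level."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(y_true: Sequence[int],
                  proba: Sequence[Sequence[float]],
                  classes: Sequence[int]) -> float:
    """Macro average of one-vs-rest AUCs (assumed aggregation scheme)."""
    aucs: List[float] = []
    for k, c in enumerate(classes):
        labels = [1 if t == c else 0 for t in y_true]
        scores = [row[k] for row in proba]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / len(aucs)


# Hypothetical toy data: three levels, six texts, predicted probabilities.
classes = [1, 2, 3]
y_true = [1, 1, 2, 2, 3, 3]
proba = [
    [0.7, 0.2, 0.1],
    [0.5, 0.4, 0.1],
    [0.2, 0.6, 0.2],
    [0.6, 0.3, 0.1],  # a level-2 text mistaken for level 1
    [0.1, 0.2, 0.7],
    [0.1, 0.3, 0.6],
]
# Predicted level = class with the highest probability in each row.
y_pred = [classes[max(range(len(classes)), key=row.__getitem__)]
          for row in proba]

print("accuracy:", accuracy(y_true, y_pred))
print("macro one-vs-rest AUC:", macro_ovr_auc(y_true, proba, classes))
```

Accuracy only looks at the top-ranked level, while AUC uses the full probability ranking, which is one reason a model can score noticeably higher on AUC than on accuracy.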