Research on sentence intention recognition and slot filling based on multi-task distillation
Pre-trained models such as BERT achieve good results on many NLP tasks, but their large parameter counts, heavy computation, and high hardware requirements make them difficult to deploy in resource-constrained environments. Model compression is the key to solving this problem, and knowledge distillation is currently one of the better compression methods. On this basis, a joint model for sentence intent recognition and slot filling based on multi-task distillation is proposed. The model applies ALBERT to a task-oriented dialogue system and uses a knowledge distillation strategy to transfer the knowledge of the ALBERT model to a BiLSTM model. Experimental results show that on the SMP 2019 evaluation dataset the sentence accuracy of the ALBERT-based joint model is 77.74%, that of the separately trained BiLSTM model is 58.33%, and that of the distilled model is 67.22%; the distilled model is thus 8.89 percentage points higher than the BiLSTM model while its inference speed is approximately 18.9 times that of ALBERT.
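The core mechanism described in the abstract, transferring knowledge from a fine-tuned ALBERT teacher to a lightweight BiLSTM student across both the intent and slot tasks, can be illustrated with the minimal sketch below. This is not the authors' code: the model sizes, label counts, pooling choice, temperature, and loss weighting are all illustrative assumptions.

```python
# Minimal sketch (assumed details, not the paper's implementation) of
# multi-task distillation: a BiLSTM student jointly predicts the sentence
# intent and per-token slot tags, matching softened logits from a teacher.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BiLSTMStudent(nn.Module):
    """Joint intent-recognition / slot-filling student model (illustrative)."""

    def __init__(self, vocab_size, emb_dim=128, hidden=256,
                 num_intents=24, num_slots=50):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.intent_head = nn.Linear(2 * hidden, num_intents)  # sentence-level
        self.slot_head = nn.Linear(2 * hidden, num_slots)      # token-level

    def forward(self, token_ids):
        h, _ = self.lstm(self.emb(token_ids))            # (B, T, 2H)
        intent_logits = self.intent_head(h.mean(dim=1))  # mean-pool over tokens
        slot_logits = self.slot_head(h)                  # one tag per token
        return intent_logits, slot_logits


def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend hard-label cross-entropy with temperature-scaled KL to the teacher."""
    hard = F.cross_entropy(student_logits, labels)
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)
    return alpha * hard + (1 - alpha) * soft


# One training step; in the paper's setting the teacher logits would come from
# the fine-tuned ALBERT model (random placeholders are used here).
student = BiLSTMStudent(vocab_size=8000)
tokens = torch.randint(1, 8000, (4, 16))        # (batch, seq_len)
intent_labels = torch.randint(0, 24, (4,))
slot_labels = torch.randint(0, 50, (4, 16))
t_intent = torch.randn(4, 24)                   # placeholder teacher intent logits
t_slot = torch.randn(4, 16, 50)                 # placeholder teacher slot logits

s_intent, s_slot = student(tokens)
loss = distillation_loss(s_intent, t_intent, intent_labels) \
     + distillation_loss(s_slot.reshape(-1, 50), t_slot.reshape(-1, 50),
                         slot_labels.reshape(-1))
loss.backward()
```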
高子雄;蒋盛益;欧炎镁;禤镇宇
School of Information Science and Technology / School of Cyberspace Security, Guangdong University of Foreign Studies, Guangzhou 510006, Guangdong, China
Mathematics
intention recognition and slot filling; neural network; knowledge distillation
Journal of Shaanxi Normal University (Natural Science Edition), 2024 (003)
96-104 (9 pages)
National Natural Science Foundation of China (61572145)