Journal of Southeast University (English Edition), 2021, Vol. 37, Issue 2: 164-170. DOI: 10.3969/j.issn.1003-7985.2021.02.005
Transformer-like model with linear attention for speech emotion recognition
Abstract
Because of the excellent performance of Transformer in sequence learning tasks, such as natural language processing, an improved Transformer-like model is proposed that is suitable for speech emotion recognition tasks. To alleviate the prohibitive time consumption and memory footprint caused by softmax inside the multi-head attention unit in Transformer, a new linear self-attention algorithm is proposed. The original exponential function is replaced by a Taylor series expansion formula. On the basis of the associative property of matrix products, the time and space complexity of the softmax operation with respect to the input's length is reduced from O(N²) to O(N), where N is the sequence length. Experimental results on the emotional corpora of two languages show that the proposed linear attention algorithm can achieve performance similar to that of the original scaled dot-product attention, while the training time and memory cost are reduced by half. Furthermore, the improved model obtains more robust performance on speech emotion recognition compared with the original Transformer.
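To make the idea concrete, the following NumPy sketch contrasts standard softmax attention with a Taylor-linearized variant. The first-order truncation exp(x) ≈ 1 + x, the function names, and the tensor shapes are illustrative assumptions; the abstract does not specify the exact expansion order or implementation the authors use. What the sketch does show is the mechanism the abstract describes: once the exponential is replaced by a linear term, the matrix products can be reassociated so that K^T V is computed first and the N×N score matrix is never materialized.

```python
# Minimal sketch, assuming single-head attention with row-wise sequences.
import numpy as np

def softmax_attention(Q, K, V):
    """Standard scaled dot-product attention: O(N^2) time and memory."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)          # (N, N) score matrix
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V):
    """Taylor-linearized attention: exp(s) ~ 1 + s, then reassociate.

    Numerator:   sum_j (1 + q_i.k_j) v_j = sum_j v_j + q_i (K^T V)
    Denominator: sum_j (1 + q_i.k_j)     = N + q_i (sum_j k_j)
    Neither term ever forms the (N, N) score matrix, so the cost in N
    drops from O(N^2) to O(N). Note the first-order weights can turn
    negative for large negative scores; how the paper guards against
    this is not stated in the abstract.
    """
    d = Q.shape[-1]
    Qs = Q / np.sqrt(d)
    kv = K.T @ V                           # (d, d), computed once
    v_sum = V.sum(axis=0)                  # (d,)
    k_sum = K.sum(axis=0)                  # (d,)
    numer = v_sum + Qs @ kv                # (N, d)
    denom = K.shape[0] + Qs @ k_sum        # (N,)
    return numer / denom[:, None]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N, d = 512, 64
    # Small score magnitudes keep the first-order expansion accurate.
    Q, K, V = (0.1 * rng.standard_normal((N, d)) for _ in range(3))
    print(np.abs(softmax_attention(Q, K, V) - linear_attention(Q, K, V)).max())
```

The key design point is the reassociation: (Q K^T) V costs O(N² d), while Q (K^T V) costs O(N d²), which is linear in the sequence length for fixed head dimension d.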
Key words
transformer / attention mechanism / speech emotion recognition / fast softmax
Classification
Information Technology and Security Science
Cite this article
Du Jing, Tang Manting, Zhao Li. Transformer-like model with linear attention for speech emotion recognition[J]. Journal of Southeast University (English Edition), 2021, 37(2): 164-170.
Funding
The National Key Research and Development Program of China (No. 2020YFC2004002, 2020YFC2004003) and the National Natural Science Foundation of China (No. 61871213, 61673108, 61571106).