Computer Engineering and Applications, 2018, Vol. 54, Issue 3: 131-135 (5 pages). DOI: 10.3778/j.issn.1002-8331.1608-0332
Speech synthesis using simplified LSTM
Abstract
Conventional parametric speech synthesis based on hidden Markov models (HMMs) gains little from training on large-scale data. Long Short-Term Memory (LSTM) is designed to capture long-term sequence features: it produces each output dynamically from the current input and its internal state, which makes sequential prediction more accurate and smoother. However, its heavy computation can still be trimmed. In this paper, the LSTM is simplified by removing the forget gate and the output gate, and the simplified network is then used to model the relationship between syllables and their cepstra on a Chinese speech data set. Both training and prediction times are halved, while the Mel cepstral distortion falls from 3.4661 for the HMM to 1.9459.
Keywords
parametric speech synthesis / neural network / Long Short-Term Memory (LSTM)
Classification
Information technology and security science
Cite this article
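Chen Zhousi, Hu Wenxin. Speech synthesis using simplified LSTM[J]. Computer Engineering and Applications, 2018, 54(3): 131-135.
Funding
National Science and Technology Support Program (No. 2015BAH01F02).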
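Illustrative sketch
The abstract states only that the forget gate and output gate are removed and that quality is measured by Mel cepstral distortion; it does not give the resulting equations. The Python sketch below is therefore one plausible reading, not the authors' implementation. It assumes that without a forget gate the cell state simply accumulates (c_t = c_{t-1} + i_t * g_t) and that without an output gate the hidden state is h_t = tanh(c_t). The function names, the parameter layout, and the use of the common distortion formula (10 / ln 10) * sqrt(2 * sum of squared differences), averaged over frames, are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, x, U, h, b):
    """Return W @ x + U @ h + b for matrices given as lists of rows."""
    return [
        sum(w * xi for w, xi in zip(W_row, x))
        + sum(u * hi for u, hi in zip(U_row, h))
        + b_k
        for W_row, U_row, b_k in zip(W, U, b)
    ]


def simplified_lstm_step(x, h_prev, c_prev, params):
    """One time step of an LSTM cell with no forget gate and no output gate.

    Assumed form (not taken from the paper):
        i_t = sigmoid(W_i x_t + U_i h_{t-1} + b_i)   # input gate, kept
        g_t = tanh(W_g x_t + U_g h_{t-1} + b_g)      # candidate update
        c_t = c_{t-1} + i_t * g_t                    # forget gate removed
        h_t = tanh(c_t)                              # output gate removed
    """
    W_i, U_i, b_i, W_g, U_g, b_g = params
    i = [sigmoid(z) for z in affine(W_i, x, U_i, h_prev, b_i)]
    g = [math.tanh(z) for z in affine(W_g, x, U_g, h_prev, b_g)]
    c = [c_k + i_k * g_k for c_k, i_k, g_k in zip(c_prev, i, g)]
    h = [math.tanh(c_k) for c_k in c]
    return h, c


def mel_cepstral_distortion(ref_frames, pred_frames):
    """Mean per-frame Mel cepstral distortion between two aligned sequences.

    Uses the common definition; the paper's exact variant (for example,
    whether the 0th coefficient is excluded) is not stated in the abstract,
    so the caller decides which coefficients to pass in.
    """
    scale = 10.0 / math.log(10.0)
    per_frame = [
        scale * math.sqrt(2.0 * sum((r - p) ** 2 for r, p in zip(ref, pred)))
        for ref, pred in zip(ref_frames, pred_frames)
    ]
    return sum(per_frame) / len(per_frame)
```

Under these assumptions the cell needs two weight blocks instead of the four used by a standard LSTM cell, which is consistent with the halved training and prediction time reported in the abstract, although the abstract itself does not spell out this accounting.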