A lip-reading recognition approach based on long short-term memory

Ma Ning, Tian Guodong, Zhou Xi

Journal of University of Chinese Academy of Sciences, 2018, Vol. 35, Issue 1: 109-117 (9 pages). DOI: 10.7523/j.issn.2095-6134.2018.01.015

A lip-reading recognition approach based on long short-term memory

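Ma Ning 1, Tian Guodong 2, Zhou Xi 2

Author information

  • 1. University of Chinese Academy of Sciences, Beijing 100049
  • 2. Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing 400714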

Abstract

Visual speech information is an important carrier of conversation. However, even when the content of the conversation is the same, visual speech information differs across speakers because of differences in lip appearance, background, and manner of speaking. To address this variability, we propose a new approach to lip-reading recognition based on long short-term memory (LSTM). We compute the positions of lip landmarks, which describe the dynamics of the lip shape, and use them as the features of the lip-reading video; these features are consistent within a class and distinctive between classes. We then use an LSTM to encode temporal information, so that it learns spatio-temporal features that are both discriminative and generalizable. We evaluate the approach on three public databases (GRID, MRIALC, and OuluVS) for lip-reading recognition of isolated words or phrases in speaker-independent experiments. On GRID and MRIALC, the accuracy of our approach is more than 30% higher than that of the conventional approach. On OuluVS, its accuracy is comparable to the state of the art. The experimental results indicate that our approach handles the variability of visual speech information effectively.
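
The pipeline summarized in the abstract (per-frame lip landmark positions fed to an LSTM that encodes the sequence, followed by classification) can be illustrated with a minimal, dependency-free sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the number of landmarks (20), the centroid-and-width normalization, the hidden size, the single-layer architecture, and the helper names `normalize_landmarks`, `TinyLSTM`, and `toy_sequence`. The weights are random and untrained, so the output probabilities carry no meaning; the sketch only shows how the data flows.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Plain matrix-vector product on nested lists.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def normalize_landmarks(points):
    # Assumed normalization: subtract the centroid and divide by mouth width,
    # so the features do not depend on where the lips sit in the frame or how large they are.
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    cx, cy = sum(xs) / len(xs), sum(ys) / len(ys)
    width = (max(xs) - min(xs)) or 1.0
    flat = []
    for x, y in points:
        flat += [(x - cx) / width, (y - cy) / width]
    return flat


class TinyLSTM:
    """Single-layer LSTM with a softmax read-out on the last hidden state (untrained)."""

    def __init__(self, n_in, n_hidden, n_classes, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        # One weight matrix per gate (input, forget, output, candidate),
        # each acting on the concatenation [x_t ; h_{t-1}].
        self.W = {g: mat(n_hidden, n_in + n_hidden) for g in "ifog"}
        self.b = {g: [0.0] * n_hidden for g in "ifog"}
        self.W_out = mat(n_classes, n_hidden)
        self.n_hidden = n_hidden

    def step(self, x, h, c):
        z = x + h  # list concatenation
        pre = {g: [a + b for a, b in zip(matvec(self.W[g], z), self.b[g])] for g in "ifog"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        c = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
        h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
        return h, c

    def classify(self, frames):
        h = [0.0] * self.n_hidden
        c = [0.0] * self.n_hidden
        for points in frames:
            h, c = self.step(normalize_landmarks(points), h, c)
        logits = matvec(self.W_out, h)
        m = max(logits)
        exps = [math.exp(v - m) for v in logits]
        total = sum(exps)
        return [e / total for e in exps]


def toy_sequence(n_frames=8, n_points=20):
    # Synthetic stand-in for tracked landmarks: an ellipse that opens and closes.
    frames = []
    for t in range(n_frames):
        opening = 0.2 + 0.8 * math.sin(math.pi * t / (n_frames - 1))
        frames.append([
            (math.cos(2 * math.pi * k / n_points), opening * math.sin(2 * math.pi * k / n_points))
            for k in range(n_points)
        ])
    return frames


frames = toy_sequence()
model = TinyLSTM(n_in=40, n_hidden=16, n_classes=3)
probs = model.classify(frames)
print(probs)
```

In practice the landmarks would come from a face or lip tracker and the weights would be learned from labeled videos; the normalization step is one plausible way to reduce speaker-dependent variation before the sequence reaches the LSTM.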

Key words

lip-reading recognition / long short-term memory / computer vision

Classification

Information technology and security science

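Cite this article

Ma Ning, Tian Guodong, Zhou Xi. A lip-reading recognition approach based on long short-term memory [J]. Journal of University of Chinese Academy of Sciences, 2018, 35(1): 109-117.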

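Funding

Supported by the National Natural Science Foundation of China (61472386, 61502444) and a sub-project of the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06040103)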

Journal of University of Chinese Academy of Sciences

OA / Peking University Core / CSCD / CSTPCD

ISSN 2095-6134
