
An End-to-End Lip-Reading Recognition Algorithm Based on the Adaptive Length of Frame Sequence

吴威龙 1, 李润恺 1, 许霜烨 2, 朱真 3

Life Science Instruments (生命科学仪器), 2023, Vol. 21, Issue 4: 35-39, 5. DOI: 10.11967/2023210805

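Author information

  • 1. School of Electronic Science and Engineering, Southeast University, Nanjing, Jiangsu 210096
  • 2. School of Integrated Circuits, Southeast University, Wuxi, Jiangsu 214000
  • 3. School of Electronic Science and Engineering, Southeast University, Nanjing, Jiangsu 210096; School of Integrated Circuits, Southeast University, Wuxi, Jiangsu 214000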

Abstract

Lip-reading recognition offers a new approach for many emerging fields such as information security and assisted driving. Existing lip-reading recognition algorithms have complex schemes, converge slowly, and are difficult to deploy on portable devices for end-to-end application. Based on a convolutional neural network (CNN) and a bidirectional long short-term memory (BLSTM) recurrent neural network, this work reports an end-to-end neural network model for lip-reading recognition with adaptive sequence length. The algorithm locates the feature regions in the video stream with the Dlib feature point localization method and passes them to the CNN for pre-processing to obtain a switch signal, which is then passed to the BLSTM to control the length of the frame sequence. The algorithm models the underlying temporal information between frames more fully and can be deployed on end-to-end portable devices. In experiments, the algorithm achieves a recognition accuracy of 67.2% on the MIRACL-VC1 dataset, an absolute improvement of 11.2% over the previous state-of-the-art adaptive lip-reading recognition algorithm.
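
The abstract does not say how the switch signal is defined or how it sets the sequence length, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the landmarks follow dlib's 68-point layout (where indices 48 to 67 outline the mouth), that the CNN emits one switch score per frame in [0, 1], and that the frames kept for the BLSTM run from the first to the last frame whose score exceeds a threshold. The names `mouth_box`, `active_span`, `trim_sequence`, and the `threshold` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple

# In dlib's 68-point facial landmark layout, indices 48..67 outline the mouth.
MOUTH_START, MOUTH_END = 48, 68

Point = Tuple[int, int]


def mouth_box(landmarks: Sequence[Point], margin: int = 5) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) around the mouth landmarks, padded by margin."""
    if len(landmarks) < MOUTH_END:
        raise ValueError("expected at least 68 landmarks")
    mouth = landmarks[MOUTH_START:MOUTH_END]
    xs = [x for x, _ in mouth]
    ys = [y for _, y in mouth]
    return (min(xs) - margin, min(ys) - margin, max(xs) + margin, max(ys) + margin)


def active_span(switch: Sequence[float], threshold: float = 0.5) -> Tuple[int, int]:
    """Return the half-open range [start, end) covering the first through the
    last frame whose switch score exceeds threshold; (0, 0) if none does.

    ASSUMPTION: this thresholding rule is illustrative only.
    """
    on = [i for i, score in enumerate(switch) if score > threshold]
    if not on:
        return (0, 0)
    return (on[0], on[-1] + 1)


def trim_sequence(frames: Sequence, switch: Sequence[float], threshold: float = 0.5) -> List:
    """Keep only the frames inside the active span, giving a variable-length
    sequence that a recurrent model such as a BLSTM could consume."""
    if len(frames) != len(switch):
        raise ValueError("frames and switch must have the same length")
    start, end = active_span(switch, threshold)
    return list(frames[start:end])
```

For example, with switch scores `[0.1, 0.2, 0.9, 0.8, 0.3, 0.7, 0.1]` over seven frames, `trim_sequence` keeps frames 2 through 5, so a short silent lead-in and tail are dropped while a brief dip inside the utterance is retained.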

Key words

Lip-reading recognition/end-to-end/convolutional neural network/bidirectional long short-term memory

Classification

Medicine and health

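Cite this article

吴威龙, 李润恺, 许霜烨, 朱真. An End-to-End Lip-Reading Recognition Algorithm Based on the Adaptive Length of Frame Sequence [J]. Life Science Instruments (生命科学仪器), 2023, 21(4): 35-39, 5.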

Funding

National Key Research and Development Program of China (2021YFF0701002)

Life Science Instruments (生命科学仪器)

ISSN 1671-7929
