A Review on Audio-Driven Digital Human Generation Methods

Liu Ying (刘颖), Li Jiting (李济廷), Chai Ruikun (柴瑞坤), Wei Jiwei (位纪伟), Yang Yang (杨阳)

Journal of University of Electronic Science and Technology of China, 2024, Vol. 53, Issue 6: 911-921, 11. DOI: 10.12178/1001-0548.2024156

A Review on Audio-Driven Digital Human Generation Methods

Liu Ying 1, Li Jiting 1, Chai Ruikun 2, Wei Jiwei 2, Yang Yang 2

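Author information

  • 1. Institute of Military Political Work, Academy of Military Sciences, Beijing 100166
  • 2. School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731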

Abstract

In recent years, the rapid development of deep learning has greatly advanced virtual digital human technology, especially audio-driven digital human video generation. Research in this field shows broad application prospects in scenarios such as video translation, film production, and virtual assistants. This paper surveys current methods and the state of research in audio-driven digital human video generation, focusing on key technologies, datasets, and evaluation strategies. Among the key technologies, generative adversarial networks, diffusion models, and neural radiance fields have all played important roles. The scale and diversity of datasets are crucial for model training, and better evaluation strategies allow generation quality to be assessed more objectively. Audio-driven digital human video generation will continue to face numerous challenges and opportunities, and the field is expected to keep innovating and developing, bringing more convenience and enjoyment to society.

Key words

audio-driven digital human / video generation / generative adversarial network / diffusion model / neural radiance field / multimodal fusion

Classification

Information Technology and Security Science

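Cite this article

Liu Ying, Li Jiting, Chai Ruikun, Wei Jiwei, Yang Yang. A Review on Audio-Driven Digital Human Generation Methods [J]. Journal of University of Electronic Science and Technology of China, 2024, 53(6): 911-921, 11.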

Funding

National Natural Science Foundation of China (62306067)

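Journal of University of Electronic Science and Technology of China

Open access; indexed in 北大核心 (Peking University Core Journals) and CSTPCD

ISSN 1001-0548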
