High-performance Singing Voice Conversion Model Based on VITS

ZHOU Keru, JIN Wei

Modern Information Technology, 2025, Vol. 9, Issue 12: 129-133, 140, 6. DOI: 10.19850/j.cnki.2096-4706.2025.12.025

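ZHOU Keru¹, JIN Wei¹

Author information

1. School of Medical Technology and Information Engineering, Zhejiang Chinese Medical University, Hangzhou, Zhejiang 310053, China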

Abstract

Singing voice conversion transforms the voice of a source singer into that of a target singer while retaining the original content and melody. As the technology has developed, a variety of network architectures and models have been proposed, and singing voice conversion algorithms have diversified. However, problems such as poor quality of the converted audio, high distortion rates, and limited vocal range still tend to occur. This paper proposes UVC (Ultra Singing Voice Conversion), a model with multi-decoupled feature constraints based on high-fidelity flow. The model is built on VITS. By combining the ContentVec encoder with the NSF-HIFI-GAN vocoder, it improves the model's input and output stages, greatly enhancing the quality and fluency of the converted audio while remaining highly robust.

Key words

singing voice conversion/VITS/ContentVec encoder/NSF-HIFI-GAN vocoder

Classification

Information Technology and Security Science

Cite this article

ZHOU Keru, JIN Wei. High-performance Singing Voice Conversion Model Based on VITS [J]. Modern Information Technology, 2025, 9(12): 129-133, 140, 6.

Modern Information Technology

ISSN: 2096-4706
