
A GAN model for sound effect generation trained on only a single short audio clip

Jiang Lin, Miao Xiangyang, Hong Zijing

Chinese Journal of Intelligent Science and Technology, 2025, Vol. 7, Issue 4: 468-483 (16 pages). DOI: 10.11959/j.issn.2096-6652.202524

A GAN generation method of sound effect using only a single short audio for training

Jiang Lin (1), Miao Xiangyang (2), Hong Zijing (3)

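Author information

  • 1. Xiangjiang Laboratory, Changsha 410205, Hunan, China; School of Artificial Intelligence and Advanced Computing, Hunan University of Technology and Business, Changsha 410205, Hunan, China
  • 2. School of Intelligent Engineering and Intelligent Manufacturing, Hunan University of Technology and Business, Changsha 410205, Hunan, China
  • 3. School of Artificial Intelligence and Advanced Computing, Hunan University of Technology and Business, Changsha 410205, Hunan, China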

Abstract

To address the low audio realism and insufficient style diversity of sound effect generation, a generative adversarial network (GAN) model based on a multi-band attention mechanism was proposed. First, a multi-band expansion mode was adopted to extract audio features at different sampling rates, and a relativistic average hinge GAN (RaHingeGAN) loss function was introduced to improve the stability of audio generation. Second, a Transformer attention mechanism was incorporated to enhance the expression of harmonic information and spectral structure, while an Alpha Dropout adaptive regularization layer was applied to mitigate overfitting. Finally, an audio style transfer module was designed to enhance style controllability. During feature learning, Mel-frequency cepstral coefficients (MFCC), Gammatone frequency cepstral coefficients (GFCC), and their higher-order differential coefficients were fused to capture the dynamic characteristics of the audio signal. Experiments demonstrated that the proposed model outperformed existing models in sound effect realism, style diversity, and generation stability, effectively improving the quality of the generated sound effects.
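
The abstract names a relativistic average hinge loss but does not give its formula. As a minimal sketch, assuming the standard formulation from the relativistic GAN literature (the paper's exact variant may differ), each critic score is compared with the mean score of the opposite class before the hinge is applied. The function names `ra_hinge_losses` and `_hinge` are hypothetical and chosen only for illustration.

```python
from statistics import fmean


def _hinge(values):
    """Mean of max(0, v) over a batch of scalar values."""
    return fmean(max(0.0, v) for v in values)


def ra_hinge_losses(real_scores, fake_scores):
    """Return (discriminator_loss, generator_loss) for one batch.

    real_scores and fake_scores are raw, unbounded critic outputs C(x).
    Assumption: standard relativistic average hinge formulation, not
    necessarily the exact loss used in the paper.
    """
    mean_real = fmean(real_scores)
    mean_fake = fmean(fake_scores)
    # Relativistic average scores: each sample is judged against the
    # average critic score of the opposite class.
    rel_real = [r - mean_fake for r in real_scores]
    rel_fake = [f - mean_real for f in fake_scores]
    # Discriminator wants real samples to score above the fake average
    # by a margin of 1, and fake samples below the real average by 1.
    d_loss = _hinge(1.0 - r for r in rel_real) + _hinge(1.0 + f for f in rel_fake)
    # Generator objective is the same expression with the roles swapped.
    g_loss = _hinge(1.0 - f for f in rel_fake) + _hinge(1.0 + r for r in rel_real)
    return d_loss, g_loss


if __name__ == "__main__":
    d, g = ra_hinge_losses([2.0, 4.0], [-1.0, 1.0])
    print(d, g)
```

When the critic already separates the two batches by more than the margin, the discriminator loss falls to zero while the generator loss stays large, which is the behaviour the hinge is meant to produce.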

Key words

sound effect generation / GAN / multi-band / Transformer

Classification

Information technology and security science

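Cite this article

Jiang Lin, Miao Xiangyang, Hong Zijing. A GAN generation method of sound effect using only a single short audio for training [J]. Chinese Journal of Intelligent Science and Technology, 2025, 7(4): 468-483.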

Funding

  • Major Project of Xiangjiang Laboratory (No.23XJ01003, No.23XJ01009)
  • Major Scientific Research Project of Hunan Provincial Department of Education (No.22A0441)
  • Hunan Provincial Innovation Foundation For Postgraduate (No.CX20231163)

Chinese Journal of Intelligent Science and Technology

ISSN 2096-6652
