Recent studies have explored the use of deep generative models of speech spectra based on variational autoencoders (VAEs), combined with unsupervised noise models, to perform speech enhancement. These studies developed iterative algorithms involving either Gibbs sampling or gradient descent at each step, making them computationally expensive. This paper proposes a variational inference method to iteratively estimate the power spectrogram of the clean speech. Our main contribution is the analytical derivation of the variational steps, in which the encoder of the pre-trained VAE can be used to estimate the variational approximation of the true posterior distribution, using the very same assumption made to train VAEs. Experiments show that the ...
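For readers unfamiliar with the assumption referred to in the abstract, the standard identity behind VAE training can be sketched as follows. This is the generic textbook form, not the paper's own notation: the symbols $x$ (an observed frame), $z$ (the latent variable), $\theta$ (decoder parameters) and $\phi$ (encoder parameters) are assumptions introduced here for illustration.

```latex
% Generic VAE identity (illustrative notation, not taken from the paper).
% The log-evidence splits into the evidence lower bound (ELBO) plus the
% KL divergence between the encoder and the true posterior.
\begin{align}
  \log p_\theta(x)
    &= \mathcal{L}(\theta, \phi; x)
     + D_{\mathrm{KL}}\!\left( q_\phi(z \mid x) \,\|\, p_\theta(z \mid x) \right), \\
  \mathcal{L}(\theta, \phi; x)
    &= \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x \mid z) \right]
     - D_{\mathrm{KL}}\!\left( q_\phi(z \mid x) \,\|\, p(z) \right).
\end{align}
```

Because the KL term in the first line is non-negative, maximizing $\mathcal{L}$ over $\phi$ during training pushes the encoder $q_\phi(z \mid x)$ toward the true posterior $p_\theta(z \mid x)$. This is the general sense in which a trained encoder can stand in for a posterior approximation.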
In previous work, we proposed a variational autoencoder-based (VAE) Bayesian permutation training sp...
Accepted to Interspeech 2021. arXiv admin note: text overlap with arXiv:2008.12595. ...
Recently, an audiovisual speech generative model based on variational autoenco...
This paper focuses on single-channel semi-supervised speech enhancement...
This paper presents a generative approach to speech enhancement based on a rec...
In this paper, we are interested in unsupervised (unknown noise) speech enhanc...
Dynamical variational autoencoders (DVAEs) are a class of deep generative mode...
Unsupervised speech enhancement based on variational autoencoders has shown promising performance co...
Paper presented at Interspeech 2016, held in San Francisco (California, USA) from 8 to ...
Variational autoencoders (VAEs) are powerful (deep) generative artificial neur...
We address speech enhancement based on variational autoencoders, which involves learning a speech pr...