We propose a voice conversion model from an arbitrary source speaker to an arbitrary target speaker using disentangled representations. Voice conversion is the task of converting the voice of a source speaker's spoken utterance to that of a target speaker. Most prior work requires knowing the source speaker, the target speaker, or both during training, with either a parallel or a non-parallel corpus. Instead, we study voice conversion with non-parallel speech corpora in a one-shot learning setting. We convert arbitrary sentences from an arbitrary source speaker to target speakers given only one or a few target speaker training utterances. To achieve this, we propose to use disentangled representations of speaker identity and linguistic context. We use a re...
In this paper, we present a voice conversion (VC) method that does not use any parallel data while t...
Voice Conversion (VC) aims to transform the speech of a source speaker to sound as if a target speak...
The recent advances in text-to-speech have been awe-inspiring, to the point of synthesizing near-hum...
Gburrek T, Ebbers J, Häb-Umbach R, Wagner P. Unsupervised Learning of a Disentangled Speech Represen...
Voice conversion (VC) transforms the speaking style of a source speaker to the speaking style of a t...
Voice conversion (VC) consists of digitally altering the voice of an individual to manipulate part o...
Many existing voice conversion (VC) systems are attractive owing to their high...
Kuhlmann M, Seebauer FM, Ebbers J, Wagner P, Haeb-Umbach R. Investigation into Target Speaking Rate ...
Voice conversion is to transform a source speaker to the target one, while keeping the ling...
We propose a joint training scheme of an any-to-one voice conversion (VC) system with LPCNet to impr...
In this paper, we use artificial neural networks (ANNs) for voice conversion and exploit the mapping...
We present an any-to-one voice conversion (VC) system, using an autoregressive model and LPCNet voco...
The objective of voice conversion algorithms is to modify the speech by a particular source speaker ...
In this paper, we present a nonparallel voice conversion (VC) approach that does not require paralle...
In this paper, we propose the use of speaker embedding networks to perform zero-shot singing voice c...