Abstract: We show the application of neural networks to one of the critical modules of a text-to-speech system: duration modeling. Our objective is the accurate prediction of segmental duration. We present a complete environment for deciding which parameters are most relevant and how to code them. We compare two systems: an unrestricted-domain database for a male voice and a restricted-domain environment for a female voice. The restricted domain offers several advantages for modeling: the variation across patterns is reduced, so most decisions about the parameters rest on more significant results. The result is a very low prediction error, especially in the restricted-domain case.
Naturalness of synthetic speech highly depends on appropriate modeling of prosodic aspects. Mostly, ...
Acoustic analysis and synthesis experiments have shown that duration and intonation patterns are the...
Text-to-Speech (TTS) synthesis can be regarded as the automatic transformation of sentences from the...
The objective of this paper is the accurate prediction of segmental duration in a Spanish text-to-sp...
In this paper we model the segmental duration of Spanish spoken in Buenos Aires, considering its app...
The results of two alternative models to predict segmental durations in speech synthesis, both based...
This paper presents a segmental durations’ model applied to the European Portuguese language for TTS...
This paper presents a segmental duration model, that, as far as the authors know, is the first publi...
The Phoneme Dedicated Artificial Neural Network (PDANN) segmental duration model consists of a set o...
In this paper we present a condensed description of a European Portuguese segmental duration’s model...
In this paper, we propose a neural network model for predicting the durations of syllables. A four l...
Naturalness of synthetic speech highly depends on appropriate modelling of prosodic aspects. Mostly,...
The main criterion in duration modeling is to model the duration pattern of the natural spee...
...quite considerably from those observable in fluent speech. The latter involves complex tempora...