Neural models are known to be over-parameterized, and recent work has shown that sparse text-to-speech (TTS) models can outperform dense ones. Although a plethora of sparsity methods has been proposed for other domains, such methods have rarely been applied in TTS. In this work, we seek to answer the question: how do selected sparsity techniques affect performance and model complexity? We compare a dense Tacotron2 baseline against models obtained by applying five sparsity techniques, and evaluate performance in terms of naturalness, intelligibility, and prosody, while also reporting model size and training time. Complementary to prior research, we find that pruning before or during training can achieve performance similar to pruning after training.
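To make the comparison of "pruning before or during training" versus "pruning after training" concrete, the following is a minimal sketch of unstructured magnitude pruning applied during training. It uses PyTorch's torch.nn.utils.prune on a toy module; the model, sparsity schedule, and pruning amounts are placeholder assumptions for illustration, not the Tacotron2 setup or the exact techniques evaluated in this work.

```python
# Sketch: iterative magnitude pruning *during* training (assumed illustration,
# not the paper's exact method). Toy model and data stand in for Tacotron2.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(80, 256), nn.ReLU(), nn.Linear(256, 80))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Modules and parameter names eligible for pruning.
prunable = [(m, "weight") for m in model if isinstance(m, nn.Linear)]

for step in range(1000):
    x = torch.randn(16, 80)  # stand-in for acoustic input features
    y = torch.randn(16, 80)  # stand-in for regression targets
    loss = loss_fn(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Every 200 steps, prune 20% of the remaining weights by global magnitude.
    # This is pruning *during* training, as opposed to a single prune applied
    # to the converged model (pruning *after* training).
    if step % 200 == 199:
        prune.global_unstructured(
            prunable, pruning_method=prune.L1Unstructured, amount=0.2
        )

# Fold the binary masks into the weights so the sparsity becomes permanent.
for module, name in prunable:
    if prune.is_pruned(module):
        prune.remove(module, name)
```

Under the same assumptions, pruning "before training" would correspond to applying a mask once at initialization and training with it fixed, while pruning "after training" would apply a single global prune (optionally followed by fine-tuning) to the converged dense model.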