The T5 model and its unified text-to-text paradigm contributed to advancing the state of the art for many natural language processing tasks. While some multilingual variants of the T5 model have recently been introduced, they were found to perform suboptimally for languages other than English when compared to monolingual variants. Motivated by these findings, we introduce IT5, the first family of encoder-decoder transformer models pretrained specifically on Italian. We perform a thorough cleaning of a web-crawled Italian corpus comprising more than 40 billion words and use it to pretrain three IT5 models of different sizes. The performance of IT5 models and their multilingual counterparts is then evaluated on a b...
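The unified text-to-text paradigm mentioned above can be sketched in a few lines: every task is cast as mapping an input string to an output string, so a single sequence-to-sequence model can handle all of them. The task prefixes below are purely illustrative assumptions, not the exact prompts used by T5 or IT5.

```python
# Minimal sketch of T5-style text-to-text task framing.
# The prefixes ("summarize", "question") are hypothetical examples;
# real models define their own task tags during fine-tuning.

def to_text_to_text(task: str, text: str) -> str:
    """Prefix the input with a task tag so one seq2seq model serves all tasks."""
    return f"{task}: {text}"

examples = [
    ("summarize", "Il modello IT5 e' stato preaddestrato su un corpus italiano."),
    ("question", "Quante parole contiene il corpus?"),
]

for task, text in examples:
    print(to_text_to_text(task, text))
```

Under this framing, classification, summarization, and question answering differ only in the prefix and the expected target string, which is what lets a single pretrained encoder-decoder be fine-tuned across heterogeneous benchmarks.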
End-to-end deep learning models have significantly advanced many tasks of Natural Language Pro...
The use of contextualised word embeddings enabled a significant performance increase for almost all...
The emergence of attention-based architectures has led to significant improvements in the performanc...
In this paper, we present an in-depth investigation of the linguistic knowledge encoded by the trans...