New model architectures: ALBERT, CamemBERT, GPT2-XL, DistilRoberta Four new models have been added since release 2.1.1. ALBERT (PyTorch & TF) (from Google Research and the Toyota Technological Institute at Chicago), released with the paper ALBERT: A Lite BERT for Self-supervised Learning of Language Representations, by Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut. CamemBERT (PyTorch) (from Facebook AI Research, INRIA, and Sorbonne Université), the first large-scale Transformer language model for French. Released alongside the paper CamemBERT: a Tasty French Language Model by Louis Martin, Benjamin Muller, Pedro Javier Ortiz Suarez, Yoann Dupont, Laurent Romary, Eric Villemonte de la Clergerie, Djame ...
Perceiver The Perceiver model was released in the previous version: Perceiver Eight new models are r...
In 2017, Vaswani et al. proposed a new neural network architecture named Transformer. That modern ar...
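The core of that architecture is scaled dot-product attention. A self-contained NumPy sketch of the formula from the paper, with toy shapes rather than the full multi-head machinery:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V  (Vaswani et al., 2017)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))  # 4 query positions, d_k = 8
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```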
Transformer-based neural models are used in many AI applications. Training these models is expensive...
New model architecture: DistilBERT Adding Hugging Face's new transformer architecture, DistilBERT des...
New model architectures: CTRL, DistilGPT-2 Two new models have been added since release 2.0. CTRL (...
FlauBERT, MMBT MMBT was added to the list of available models, as the first multi-modal model to ma...
Marian (@sshleifer) A new model architecture, MarianMTModel with 1,008+ pretrained weights is avail...
New Model: BART (added by @sshleifer) Bart is one of the first Seq2Seq models in the library, and ac...
Trainer & TFTrainer Version 2.9 introduces a new Trainer class for PyTorch, and its equivalent TFTra...
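A minimal sketch of the PyTorch `Trainer` on a two-example toy dataset, assuming `torch` (and, on recent versions, `accelerate`) is installed; the model name, labels, and hyperparameters are illustrative, and a real run would use a proper dataset and evaluation:

```python
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Illustrative base model; Trainer handles the training loop, batching,
# device placement, and checkpointing.
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
enc = tokenizer(["great movie", "terrible movie"], truncation=True, padding=True)

class ToyDataset(torch.utils.data.Dataset):
    """Wraps tokenizer output and labels as tensors for the Trainer."""
    def __init__(self, encodings, labels):
        self.encodings, self.labels = encodings, labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=2, report_to=[]),
    train_dataset=ToyDataset(enc, [1, 0]),
)
trainer.train()  # one tiny epoch, just to exercise the loop
```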
Rust tokenizers (@mfuntowicz, @n1t0 ) Tokenizers for Bert, Roberta, OpenAI GPT, OpenAI GPT2, Transf...
New class Pipeline (beta): easily run and use models on downstream NLP tasks We have added a new cl...
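A minimal sketch of the pipeline API for one of the bundled tasks; pinning the model (here a standard SST-2 classifier checkpoint) avoids depending on the library's default choice:

```python
from transformers import pipeline

# "sentiment-analysis" is one of the bundled tasks; the model argument pins
# a specific checkpoint instead of relying on the task default.
classifier = pipeline("sentiment-analysis",
                      model="distilbert-base-uncased-finetuned-sst-2-english")
result = classifier("Transformers makes NLP easy.")
print(result)  # a list of {"label": ..., "score": ...} dicts
```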
Longformer Longformer (@ibeltagy) Longformer for QA (@patil-suraj + @patrickvonplaten) Longformer f...
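Longformer replaces full self-attention with a sliding local window plus a handful of globally-attending tokens, so cost grows linearly rather than quadratically in sequence length. A NumPy sketch of the allowed-attention pattern (the pattern only, not the library's banded implementation):

```python
import numpy as np

def longformer_mask(seq_len, window, global_idx):
    """Boolean matrix of allowed attention pairs: a local window of +/- `window`
    positions around each token, plus a few global tokens."""
    mask = np.zeros((seq_len, seq_len), dtype=bool)
    for i in range(seq_len):
        mask[i, max(0, i - window):min(seq_len, i + window + 1)] = True  # local window
    mask[global_idx, :] = True  # global tokens attend to every position
    mask[:, global_idx] = True  # every position attends to global tokens
    return mask

m = longformer_mask(16, window=2, global_idx=[0])
print(int(m.sum()), "allowed pairs vs", 16 * 16, "for full attention")
```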
T5 Model (@patrickvonplaten, @thomwolf ) T5 is a powerful encoder-decoder model that formats every N...
Name change: welcome Transformers Following the extension to TensorFlow 2.0, pytorch-transformers =...