This release brings better support for model optimization:
- Add auto batch size matching for comparison operators (#432, #433)
- Add a default criterion_update_fn for optim.model_optimization (formerly optim.default_transformer_optim_loop, see below) (#460)
- Add support for supervised datasets in model optimization (#461)
- Add a usage example for model optimization (#462, #463)

Additionally:
- Enable passing an optimizer to pystiche.optim.image_optimization (#431) (see the sketch below)
- Split the handling of multi-layer encoders into a separate class (#438)

Finally, this release marks the last beta release. In the future, pystiche will be trimmed of functionalities that can be handled by other specialized libraries. An example would be replacing pystiche.optim.log with tqdm or...
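As a hedged sketch of the optimizer pass-through from #431: the release notes confirm only that an optimizer can now be given to pystiche.optim.image_optimization, so the keyword name `optimizer` is an assumption here, and `DummyCriterion` is a hypothetical stand-in for a real perceptual loss.

```python
import torch
from torch import nn

from pystiche import optim


class DummyCriterion(nn.Module):
    """Stand-in for a real perceptual loss: plain MSE against a fixed
    target image. Purely illustrative; a real setup would use pystiche's
    style and content losses instead."""

    def __init__(self, target):
        super().__init__()
        self.register_buffer("target", target)

    def forward(self, image):
        return nn.functional.mse_loss(image, self.target)


input_image = torch.rand(1, 3, 256, 256)
criterion = DummyCriterion(torch.rand(1, 3, 256, 256))

# The keyword name `optimizer` is an assumed parameter name, not a
# verified signature; the optimizer is bound to the image being optimized.
optimizer = torch.optim.Adam([input_image.requires_grad_(True)], lr=1e-1)
output_image = optim.image_optimization(
    input_image, criterion, optimizer=optimizer
)
```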
State-of-the-art neural sequence-to-sequence (seq2seq) models often do not perform well for small tr...
It has been intensively investigated that the local shape, especially the flatness, of the loss landscap...
Training Deep Neural Networks is complicated by the fact that the distribution of each layer’s input...
It is finally here: the first stable release of pystiche :tada: It has been quite some (intern...
New model architectures: CTRL, DistilGPT-2 Two new models have been added since release 2.0. CTRL (...
Batch Normalization (BatchNorm) is a technique that enables the training of deep neural networks, es...
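For concreteness, a minimal PyTorch sketch of where BatchNorm sits in a network; the architecture below is an arbitrary illustration, not taken from the text.

```python
import torch
from torch import nn

# A small convolutional block with BatchNorm inserted between each
# convolution and its nonlinearity.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.BatchNorm2d(16),  # normalizes each channel over the batch
    nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.BatchNorm2d(32),
    nn.ReLU(),
)

x = torch.rand(8, 3, 32, 32)  # batch of 8 RGB images
y = model(x)                  # train mode: uses batch statistics
model.eval()                  # eval mode: uses running estimates instead
```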
New model architecture: DistilBERT Adding Huggingface's new transformer architecture, DistilBERT des...
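A minimal loading sketch with the transformers library; "distilbert-base-uncased" is the published checkpoint identifier, and the input sentence is an arbitrary placeholder.

```python
import torch
from transformers import DistilBertModel, DistilBertTokenizer

# Load the released DistilBERT checkpoint and its tokenizer.
tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
model = DistilBertModel.from_pretrained("distilbert-base-uncased")

inputs = tokenizer.encode("Hello, DistilBERT!", return_tensors="pt")
with torch.no_grad():
    hidden_states = model(inputs)[0]  # (batch, seq_len, hidden_size)
```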
FlauBERT, MMBT MMBT was added to the list of available models, as the first multi-modal model to ma...
New model architectures: ALBERT, CamemBERT, GPT2-XL, DistilRoberta Four new models have been added i...
Trainer & TFTrainer Version 2.9 introduces a new Trainer class for PyTorch, and its equivalent TFTra...
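A hedged sketch of the PyTorch Trainer workflow introduced in version 2.9; the model, the two-sentence toy dataset, and all hyperparameters are placeholders chosen for illustration.

```python
import torch
from torch.utils.data import Dataset
from transformers import (
    BertForSequenceClassification,
    BertTokenizer,
    Trainer,
    TrainingArguments,
)


class ToyDataset(Dataset):
    """Tiny stand-in dataset: two labelled sentences, purely illustrative."""

    def __init__(self, tokenizer):
        texts, self.labels = ["great movie", "terrible movie"], [1, 0]
        self.encodings = [
            tokenizer.encode_plus(t, max_length=16, pad_to_max_length=True)
            for t in texts
        ]

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {k: torch.tensor(v) for k, v in self.encodings[idx].items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item


tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased")

# Trainer bundles the training loop; TrainingArguments holds the config.
training_args = TrainingArguments(output_dir="./results", num_train_epochs=1)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=ToyDataset(tokenizer),
    eval_dataset=ToyDataset(tokenizer),
)
trainer.train()
```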
Packages that encode machine-learned models into optimization problems are an underdeveloped area, desp...
Version 2.0.0 Changed models Updated the entire library based on the Optimizer class: Add class Problem a...
Raw Data for: Scalable Co-Optimization of Morphology and Control in Embodied Machines Trials from N...
An unprecedented boom has been witnessed in the research area of artistic style transfer ever sin...
Despite the significant success of deep learning in computer vision tasks, cross-domain tasks still ...