Artificial neural networks represent an HPC workload of increasing importance. In particular, the field of Natural Language Processing (NLP) has been undergoing a revolution in recent years. Training ever larger language models, such as GPT-3, demands substantial HPC resources and has the potential to greatly impact everyday technology. The OpenGPT-X project, established in 2022, aims not to leave this field to large tech companies but to provide an open, publicly funded alternative based on European values. The Jülich Supercomputing Centre is a consortium partner providing HPC infrastructure for the pre-training of the models. We research the optimization potential in the training process, for example by using novel accelerator architectures...
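To make the kind of distributed pre-training described here concrete, below is a minimal sketch of synchronous data-parallel training using PyTorch's DistributedDataParallel with an NCCL backend. The model, data, and training loop are placeholders for illustration only, not the OpenGPT-X training setup:

```python
# Minimal sketch of synchronous data-parallel pre-training with PyTorch
# DistributedDataParallel (DDP). Model, data, and hyper-parameters are
# placeholders, not the OpenGPT-X setup; rank and world size come from the
# launcher (e.g. srun or torchrun, which set the required env variables).
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")
    local_rank = dist.get_rank() % torch.cuda.device_count()
    torch.cuda.set_device(local_rank)

    model = torch.nn.Transformer().cuda()      # placeholder model (d_model=512)
    model = DDP(model, device_ids=[local_rank])
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for step in range(10):                     # placeholder training loop
        src = torch.rand(32, 8, 512, device="cuda")  # (seq, batch, d_model)
        tgt = torch.rand(32, 8, 512, device="cuda")
        loss = model(src, tgt).pow(2).mean()   # dummy loss for illustration
        opt.zero_grad()
        loss.backward()                        # DDP all-reduces gradients here
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Each rank processes its own mini-batch, and DDP averages gradients across ranks during the backward pass, so every replica takes an identical optimizer step.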
The crystallization of modeling methods around the Transformer architecture has been a boon for prac...
ChatGPT (Chat Generative Pre-trained Transformer) is a large artificial intelligence language model ...
Despite their state-of-the-art performance on many NLP tasks, the high energy cost and long infe...
One of the major current research trends is the evolution of heterogeneous parallel comp...
The field of Natural Language Processing (NLP) has been undergoing a revolution in recent years. Lar...
This report documents the program and the outcomes of Dagstuhl Seminar 22232 “Efficient and Equitabl...
This thesis aims to investigate the feasibility of generating code in high-performance computing lan...
MPI-learn and MPI-opt are libraries to perform large-scale training and hyper-parameter optimization...
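The core pattern behind such MPI-based data-parallel training is that each rank computes a gradient on its own mini-batch, the gradients are all-reduced across ranks, and every rank applies the same averaged update. Below is a minimal illustrative sketch of that pattern using mpi4py and synthetic least-squares data; it is not MPI-learn's or MPI-opt's actual API:

```python
# Illustrative sketch of MPI-based data-parallel gradient averaging, the core
# idea behind MPI training libraries. Uses mpi4py and synthetic data directly;
# NOT the actual API of MPI-learn or MPI-opt.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

rng = np.random.default_rng(seed=rank)   # each rank gets its own data stream
w = np.zeros(4)                          # shared model weights (placeholder)
lr = 0.1

for step in range(100):
    # Each rank draws its own synthetic mini-batch.
    X = rng.normal(size=(32, 4))
    y = X @ np.array([1.0, -2.0, 0.5, 3.0]) + rng.normal(scale=0.1, size=32)

    # Local gradient of the mean squared error.
    grad = 2.0 / len(y) * X.T @ (X @ w - y)

    # Sum gradients across all ranks, then take an identical averaged step.
    avg = np.empty_like(grad)
    comm.Allreduce(grad, avg, op=MPI.SUM)
    w -= lr * (avg / size)

if rank == 0:
    print("final weights:", w)
```

Run with, e.g., `mpiexec -n 4 python train.py`; because every rank applies the same averaged gradient, the weight vectors stay bitwise identical across ranks.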
The field of Natural Language Processing (NLP) has seen significant advancements in recent years, th...
In 2022, the launch of ChatGPT reinvigorated the debate about the use of artificial intelligence, ma...
In recent years, the number of parameters in a single deep learning (DL) model has been growing much fast...
OpenAI’s continued efforts to push forward our knowledge of natural language processing and build on it ha...
In recent years, proficiency in data science and machine learning (ML) has become one of the most reques...
Code and scripts accompanying the Supercomputing 2021 paper "Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM".