The effectiveness of Machine Learning (ML) methods depends on access to large, suitable datasets. In this article, we present how we build the LS-CAT (Large-Scale CUDA AutoTuning) dataset, sourced from GitHub, for the purpose of training NLP-based ML models. Our dataset includes 19 683 CUDA kernels focused on linear algebra. In addition to the CUDA codes, our LS-CAT dataset contains 5 028 536 associated runtimes, covering different combinations of kernels, block sizes and matrix sizes. The runtimes are GPU benchmarks on both Nvidia GTX 980 and Nvidia T4 systems. This information creates a foundation upon which NLP-based models can find correlations between source-code features and the optimal choice of thread block size. There are several results that ...
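As a concrete illustration of what a single LS-CAT runtime entry represents, the sketch below (not taken from the dataset's actual harness; the kernel, the matrix size, and the swept block sizes are illustrative assumptions) times one naive matrix-multiplication kernel with CUDA events while sweeping the thread block size, producing one (kernel, block size, matrix size) runtime per iteration:

```cuda
// Minimal benchmark sketch: time one kernel across several thread block
// sizes, as one row of LS-CAT-style runtime data per configuration.
// All names (matmul, n, blockSizes) are hypothetical, for illustration only.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void matmul(const float* A, const float* B, float* C, int n) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < n && col < n) {
        float acc = 0.0f;
        for (int k = 0; k < n; ++k)
            acc += A[row * n + k] * B[k * n + col];
        C[row * n + col] = acc;
    }
}

int main() {
    const int n = 1024;                    // one example matrix size
    const int blockSizes[] = {8, 16, 32};  // thread block edge lengths to sweep
    size_t bytes = (size_t)n * n * sizeof(float);

    float *A, *B, *C;
    cudaMalloc(&A, bytes); cudaMalloc(&B, bytes); cudaMalloc(&C, bytes);
    cudaMemset(A, 0, bytes); cudaMemset(B, 0, bytes);  // deterministic inputs

    cudaEvent_t start, stop;
    cudaEventCreate(&start); cudaEventCreate(&stop);

    for (int bs : blockSizes) {
        dim3 block(bs, bs);
        dim3 grid((n + bs - 1) / bs, (n + bs - 1) / bs);

        cudaEventRecord(start);
        matmul<<<grid, block>>>(A, B, C, n);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);
        // One (kernel, block size, matrix size) -> runtime observation.
        printf("block %2dx%-2d: %.3f ms\n", bs, bs, ms);
    }

    cudaEventDestroy(start); cudaEventDestroy(stop);
    cudaFree(A); cudaFree(B); cudaFree(C);
    return 0;
}
```

Repeating such a sweep over many kernels and matrix sizes, on both a GTX 980 and a T4, yields the kind of source-to-runtime pairs an NLP-based model could learn block-size predictions from.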