High-performance computing is increasingly done on parallel machines such as GPUs. In my work, I address two major kinds of optimization: block size tuning and mixed-precision tuning. Block size tuning involves selecting an optimal block size for CUDA kernels, in which threads of execution are grouped into blocks. Earlier techniques include autotuning, which requires multiple kernel executions, and NVIDIA's Occupancy Calculator, which yields several candidate solutions, none of which may be the true optimum. My technique uses a support vector regression (SVR) model built on static kernel features as well as dynamic features to predict an optimal block size. It is evaluated on 89 kernels from 10 different applications. The second optimization...
Abstract — GPU based on CUDA Architecture developed by NVIDIA is a high performance computing device...
GPUs have become popular due to their high computational power. Data scientists rely on GPUs to proc...
GPUs are an increasingly popular implementation platform for a variety of general purpose applicatio...
Recent years have witnessed phenomenal growth in the application, and capabilities of Graphical Proc...
Graphics Processing Units (GPUs) have revolutionized the HPC landscape. The first generation of exas...
Graphics Processing Units (GPUs) have revolutionized the HPC landscape. The first generation of exas...
GPUs have been used for years in compute intensive applications. Their massive parallel processing c...
Graphics Processing Units (GPUs) have revolutionized the computing landscape in the past decade and ...
Graphics Processing Units (GPUs) have revolutionized the HPC landscape. The first generation of exas...
We have developed several autotuning benchmarks in CUDA that take into account performance-relevant ...
Optimal performance is an important goal in compute intensive applications. For GPU applications, th...
This paper presents a novel optimizing compiler for general purpose computation on graphics processi...
This paper presents a novel optimizing compiler for general purpose computation on graphics processi...
GPGPU Computing using CUDA is rapidly gaining ground today. GPGPU has been brought to the masses thr...
2012-05-02: Graphics Processing Units (GPUs) have evolved to devices with teraflop-level performance p...