The choice of threadblock size and shape is one of the most important user decisions when a parallel problem is coded to run on GPU architectures. In fact, threadblock configuration has a significant impact on the global performance of the program. Unfortunately, the programmer does not have enough information about the subtle interactions between this choice of parameters and the underlying hardware. This paper presents uBench, a suite of micro-benchmarks designed to explore the performance impact of combining (1) the threadblock size and shape choice criteria, and (2) the GPU hardware resources and configurations. Each micro-benchmark has been designed to be as simple as possible in order to focus on a single effect derived from the hardw...
In this work we explore the performance of CUDA in quenched lattice SU(2) simulations. CUDA, NVIDIA ...
We present the performance analysis of a port of the LU benchmark from the NAS Parallel Benchmark (N...
This work analyzes the role of graphic processing units (GPUs) in the framework of traditional paral...
The NVIDIA graphics processing units (GPUs) are playing an important role as general purpos...
GPU based on CUDA Architecture developed by NVIDIA is a high performance computing device...
NAS Parallel Benchmarks (NPB) is a standard benchmark suite used in the evaluation of parallel hardw...
Graphics processors (GPU) are interesting for nongraphics parallel computation because of t...
High-performance computing is increasingly being done on parallel machines like GPUs. In my work, I ...
The characteristics of graphics processing units (GPUs), especially their parallel execution capabil...
GPGPU Computing using CUDA is rapidly gaining ground today. GPGPU has been brought to the masses thr...
GPUs are widely used to meet the ever-increasing demands of high-performance computing. High-e...
The significant growth in computational power of modern Graphics Processing Units (GPUs) coupled wit...
We have developed several autotuning benchmarks in CUDA that take into account performance-relevant ...
Tuning GPU applications is a very challenging task as any source-code optimization can sensibly impa...
We develop a microbenchmark-based performance model for NVIDIA GeForce 200-series GPUs. Our model id...