Measuring the Impact of Configuration Parameters in CUDA Through Benchmarking

Yuri Torres
Arturo Gonzalez-Escribano
Diego R. Llanos

Publication date

August 2016

Abstract

The choice of threadblock size and shape is one of the most important decisions a programmer makes when coding a parallel problem to run on GPU architectures. In fact, the threadblock configuration has a significant impact on the overall performance of the program. Unfortunately, programmers do not have enough information about the subtle interactions between this choice of parameters and the underlying hardware. This paper presents uBench, a suite of micro-benchmarks that explores the performance impact of combining (1) the criteria for choosing threadblock size and shape, and (2) the GPU hardware resources and configurations. Each micro-benchmark has been designed to be as simple as possible to focus on a single effect derived from the hardw...