Each new generation of GPUs vastly increases the resources available to GPGPU programs. GPU programming models (like CUDA) were designed to scale to use these resources. However, we find that CUDA programs actually do not scale to utilize all available resources, with over 30% of resources going unused on average for programs of the Parboil2 suite that we used in our work. Current GPUs therefore allow concurrent execution of kernels to improve utilization. In this work, we study concurrent execution of GPU kernels using multiprogram workloads on current NVIDIA Fermi GPUs. On two-program workloads from the Parboil2 benchmark suite we find concurrent execution is often no better than serialized execution. We identify that the lack of contr...
Concurrency is pervasive and perplexing, particularly on graphics processing units (GPUs). Current s...
Current generation GPUs can accelerate high-performance, compute-intensive applications by exploitin...
Graphics processors, or GPUs, have recently been widely used as accelerators in shared envi...
As the complexity of applications continues to grow, each new generation of GPUs has been equipped w...
Modern graphics processing units (GPUs) are powerful parallel processing multi-core devices that are f...
Co-executing GPU kernels on a partitioned GPU has been shown to improve utilization efficiency of po...
The unrivaled computing capabilities of modern GPUs meet the demand of processing massive amounts of...
Execution of GPGPU workloads consists of different stages including data I/O on the CPU, memory copy...
Graphics Processing Units (GPUs) have been widely adopted to accelerate the execution of HPC workload...
GPUs are being increasingly adopted as compute accelerators in many domains, spanning environments f...
Heterogeneous computing nodes are now pervasive throughout computing, and GPUs have emerged as a lea...
GPUs have gained tremendous popularity in a broad range of application domains. These appli...
As modern GPU workloads become larger and more complex, there is an ever-increasing demand for GPU c...