Co-executing GPU kernels on a partitioned GPU has been shown to improve the utilization efficiency of poorly scaling tasks. While kernels can execute in parallel, data transfers to the GPU are serial, which can negatively impact both the parallelism and the predictability of the kernels. In this work, we implement a fairness-based approach to memory transfers by splitting data sets into chunks and transferring the chunks interleaved, and we evaluate the overhead of this approach. We then develop a model that predicts when kernels will start under this implementation. We found that chunked transfers in a single CUDA stream have only a small overhead compared to serial transfers, while event-synchronized transfers in several streams have a larger overhead, particularly for chunk sizes le...
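The effect of interleaving chunks on a serial link can be illustrated with a toy model. The sketch below is not the model developed in this work; it assumes a single link with constant bandwidth, a round-robin order over tasks, and an optional fixed cost per chunk. The function names and the parameters `chunk`, `bandwidth`, and `per_chunk_overhead` are hypothetical and chosen only for illustration.

```python
from collections import deque


def serial_finish_times(sizes, bandwidth):
    """Finish time of each task's transfer when data sets are sent one after another."""
    now = 0.0
    finish = []
    for size in sizes:
        now += size / bandwidth
        finish.append(now)
    return finish


def interleaved_finish_times(sizes, chunk, bandwidth, per_chunk_overhead=0.0):
    """Finish time of each task's transfer when chunks are sent round-robin.

    sizes: bytes to transfer per task (assumed known up front)
    chunk: chunk size in bytes
    bandwidth: link bandwidth in bytes per second (assumed constant)
    per_chunk_overhead: assumed fixed cost in seconds added to every chunk
    """
    remaining = list(sizes)
    finish = [0.0] * len(sizes)
    # Tasks with nothing to transfer never enter the queue and finish at time 0.
    queue = deque(i for i, size in enumerate(sizes) if size > 0)
    now = 0.0
    while queue:
        i = queue.popleft()
        n = min(chunk, remaining[i])
        now += per_chunk_overhead + n / bandwidth
        remaining[i] -= n
        if remaining[i] > 0:
            queue.append(i)  # task still has data: back of the line
        else:
            finish[i] = now  # last chunk arrived: its kernel could start here
    return finish


if __name__ == "__main__":
    sizes = [4, 2]  # a large data set queued ahead of a small one
    print("serial:     ", serial_finish_times(sizes, bandwidth=1))
    print("interleaved:", interleaved_finish_times(sizes, chunk=1, bandwidth=1))
```

In this toy setting the small transfer finishes earlier when chunks are interleaved, while the total transfer time is unchanged as long as the per-chunk cost is zero. A nonzero per-chunk cost lengthens the total in proportion to the number of chunks, which is one simple way to picture why smaller chunks could cost more.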
Graphics processing units (GPUs) have become prevalent in modern computing systems. While their high...
GPUs are commonly used as coprocessors to accelerate a compute-intensive task, thanks to their massi...
2018-02-23 Graphics Processing Units (GPUs) are designed primarily to execute multimedia, and game re...
The current trend in recently released Graphics Processing Units (GPUs) is to exploit transistor scal...
Each new generation of GPUs vastly increases the resources available to GPGPU programs. GPU programm...
Graphics processors, or GPUs, have recently been widely used as accelerators in shared envi...
As modern GPU workloads become larger and more complex, there is an ever-increasing demand for GPU c...
The CPU-GPU combination is a widely used heterogeneous computing system in which the CPU and GPU hav...
GPUs have gained tremendous popularity in a broad range of application domains. These appli...
In order to satisfy timing constraints, modern real-time applications require massively parallel acc...
Execution of GPGPU workloads consists of different stages including data I/O on the CPU, memory copy...