© 2021 ACM. Recently, graphics processing unit (GPU) multitasking has become important on many platforms, since an efficient GPU multitasking mechanism lets more GPU-enabled tasks run on a limited number of physical GPUs. However, current GPU multitasking technologies, such as NVIDIA Multi-Process Service (MPS) and Hyper-Q, may not fully utilize GPU resources because they do not consider the efficient use of intra-GPU resources. In this paper, we present smCompactor, a fine-grained GPU multitasking framework that fully exploits intra-GPU resources for different workloads. smCompactor dispatches specific thread blocks (TBs) of different GPU kernels to appropriate streaming multiprocessors (SMs) based on profiling results for the workloads. Wi...
Using multi-GPU systems, including GPU clusters, is gaining popularity in scientific computing. Howe...
Each new generation of GPUs vastly increases the resources available to GPGPU programs. GPU programm...
High compute-density with massive thread-level parallelism of Graphics Processing Units (GPUs) is be...
Graphics processing units (GPUs) are increasingly adopted in modern computer systems beyond their tr...
Graphics Processing Units (GPUs) are currently widely used in High Performance Computing (HPC) applic...
This paper describes GPUSync, which is a framework for managing graphics processing units (GPUs) in ...
Graphics processing units (GPUs) feature an increasing number of streaming multiprocessors (SMs) wit...
Graphics processing units (GPUs) have become a very powerful platform embracing a concept of heterog...
In this paper, we present two conceptual frameworks for GPU applications to adjust their task execut...
GPUs are being increasingly adopted as compute accelerators in many domains, spanning environments f...
The graphics processing unit (GPU) is becoming a very powerful platform to accelerate graphics and d...
The continued growth of the computational capability of throughput processors has made throughput...
We present the design and first performance and usability evaluation of GeMTC, a novel execution mod...
Map-Reduce is a framework for processing parallelizable problems across huge datasets using a large c...
Each new generation of GPUs vastly increases the resources available to GPGPU programs. GPU program...