Enabling the CUDA Unified Memory model in Edge, Cloud and HPC offloaded GPU kernels

GPU Performance Portability Using Standard C++ and SYCL

Delaney, Hugh

February 2023

The proliferation of accelerators, in particular GPUs, over the past decade is impacting the way s...

Adding CUDA® Support to Cling: JIT Compile to GPUs

Ehrig, Simeon
Naumann, Axel
Huebl, Axel

September 2018

We present the results of a diploma thesis adding CUDA (runtime) C++ support to cling. Today's HPC s...

Performance of CUDA Unified Memory in CMS Heterogeneous Pixel Reconstruction

Matti J. Kortelainen
Martin Kwok

August 2021

The management of separate memory spaces of CPUs and GPUs brings an additional burden to the develop...

Enabling the CUDA Unified Memory model in Edge, Cloud and HPC offloaded GPU kernels

Montella R.
D. Di Luccio
C. G. De Vita
G. Mellone
M. Lapegna
G. Laccetti
G. Giunta
S. Kosta

January 2022

The use of hardware accelerators, based on code and data offloading devoted to overcoming the CPU l...

A complete and efficient CUDA-sharing solution for HPC clusters

Peña Monferrer, Antonio J.
Reaño, Carlos
Silla, Federico
Mayo, Rafael
Quintana-Orti, Enrique S.
Duato, José

January 2014

In this paper we detail the key features, architectural design, and implementation of rCUDA, an adv...

An Investigation of Unified Memory Access Performance in CUDA

Raphael L
Tiansheng Zhang
Ayse K. Coskun
Martin Herbordt

January 2016

Managing memory between the CPU and GPU is a major challenge in GPU computing. A programmin...

Enabling CUDA acceleration within virtual machines using rCUDA

Duato, José
Peña, Antonio J.
Silla, Federico
Fernández, Juan C.
Mayo, Rafael
Quintana-Ortí, Enrique S.

The hardware and software advances of Graphics Processing Units (GPUs) have favored the development ...

Improving GPGPU Concurrency with Elastic Kernels

Pai, Sreepathi
Thazhuthaveetil, Matthew J
Govindarajan, R

January 2013

Each new generation of GPUs vastly increases the resources available to GPGPU programs. GPU programm...

Improving GPGPU concurrency with elastic kernels

Sreepathi Pai
Matthew J. Thazhuthaveetil
R. Govindarajan

January 2013

Each new generation of GPUs vastly increases the resources available to GPGPU programs. GPU program...

Efficient parallelisation techniques for applications running on GPUs using the CUDA framework

Ottesen, Alexander

January 2009

Modern graphic processing units (GPU) are powerful parallel processing multi-core devices that are f...

Implementing CUDA Unified Memory in the PyTorch Framework

Choi, Jake
Yeom, Heon Young
Kim, Yoonhee

January 2021

Improving the user experience of the rCUDA remote GPU virtualization framework

Reaño, Carlos
Silla, Federico
Castelló, Adrián
Peña Monferrer, Antonio J.
Mayo, Rafael
Quintana-Orti, Enrique S.

October 2014

Graphics processing units (GPUs) are being increasingly embraced by the high-performance computing c...

On the Virtualization of CUDA Based GPU Remoting on ARM and X86 Machines in the GVirtuS Framework

Montella, Raffaele
Giunta, Giulio
Laccetti, Giuliano
Lapegna, Marco
Palmieri, Carlo
Ferraro, Carmine
Pelliccia, Valentina
Hong, Cheol Ho
Spence, Ivor
Nikolopoulos, Dimitrios S.

January 2017

The astonishing development of diverse and different hardware platforms is twofold: on one side, the...

On the Use of Remote GPUs and Low-Power Processors for the

October 2015

Many current high-performance clusters include one or more GPUs per node in order to dramat...

Optimizing GPU virtualization with address mapping and delayed submission

Wang, Xiaolin
Wang, Hanbing
Sang, Yan
Wang, Zhenlin
Luo, Yingwei

March 2014
