Efficient NAS parallel benchmark kernels with CUDA

De Araujo G. A.
Griebler D.
Danelutto M.
Fernandes L. G.

Publication date

January 2020

DOI

10.1109/PDP50117.2020.00009

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Abstract

The NAS Parallel Benchmarks (NPB) are one of the standard benchmark suites used to evaluate parallel hardware and software. Many research efforts have tried to provide parallel versions beyond the original OpenMP and MPI implementations. For GPU accelerators, only OpenCL and OpenACC are available as consolidated versions. Our goal is to provide an efficient parallel implementation of the five NPB kernels with CUDA. Our contribution covers different aspects. First, best parallel programming practices were followed to implement the NPB kernels using CUDA. Second, support for larger workloads (classes B and C) makes it possible to stress and investigate the memory of robust GPUs. Third, we show that it is possible to make NPB efficient and s...
