NAS Parallel Benchmarks (NPB) is a standard benchmark suite used to evaluate parallel hardware and software. Several academic research efforts have made these benchmarks available in parallel programming models beyond the original OpenMP and MPI versions. This work joins those efforts by providing a new CUDA implementation of NPB. Our contribution goes beyond the implementation itself. First, we define design principles based on best programming practices for GPUs and apply them to each benchmark using CUDA. Second, we provide easy-to-use parametrization support for configuring the number of threads per block in our version. Third, we conduct a broad study on the impact of the number...
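To make the threads-per-block parameter concrete, the following minimal sketch shows the arithmetic that usually ties it to the grid size of a one-dimensional kernel launch. It is an illustration only, not the paper's parametrization mechanism: the function name `launch_config`, the problem size of one million elements, and the candidate block sizes are all hypothetical, and the 1024-thread upper bound is assumed from the per-block limit of current NVIDIA GPUs.

```python
# Hypothetical helper (not from the paper): derive a 1D launch configuration
# from a user-chosen threads-per-block value.

MAX_THREADS_PER_BLOCK = 1024  # assumed per-block limit on current NVIDIA GPUs


def launch_config(n_elements: int, threads_per_block: int) -> tuple[int, int]:
    """Return (blocks, threads_per_block) so that blocks * threads >= n_elements."""
    if n_elements <= 0:
        raise ValueError("n_elements must be positive")
    if not 1 <= threads_per_block <= MAX_THREADS_PER_BLOCK:
        raise ValueError("threads_per_block must be between 1 and 1024")
    # Ceiling division: the last block may be only partially used.
    blocks = (n_elements + threads_per_block - 1) // threads_per_block
    return blocks, threads_per_block


if __name__ == "__main__":
    n = 1_000_000  # hypothetical problem size
    for tpb in (64, 128, 256, 512, 1024):
        blocks, _ = launch_config(n, tpb)
        idle_threads = blocks * tpb - n  # threads launched beyond the data
        print(f"threads/block={tpb:5d}  blocks={blocks:6d}  idle threads={idle_threads}")
```

Changing the threads-per-block value changes both the number of blocks and the number of idle threads in the final block, which is one reason a benchmark's performance can be sensitive to this setting.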
Programming Massively Parallel Processors discusses basic concepts about parallel programming and GP...
The threadblock size and shape choice is one of the most important user decisions when a parallel pr...
Many general-purpose applications exploit Graphics Processing Units (GPUs) by executing a s...
NAS Parallel Benchmarks (NPB) are one of the standard benchmark suites used to evaluate parallel har...
NPB Benchmark Kernels for GPU with CUDA. Reference paper citation [DOI]: Araujo, G. A.; Griebler, D....
The NAS Parallel Benchmarks (NPB), originally implemented mostly in Fortran, is a consolidated suite...
Benchmarking is a way to study the performance of new architectures and parallel programming framewo...
The GPU based on the CUDA architecture developed by NVIDIA is a high-performance computing device...
The NAS Parallel Benchmarks (NPB) are a suite of parallel computer performance benchmarks. They were...
We present the performance analysis of a port of the LU benchmark from the NAS Parallel Benchmark (N...
A new set of benchmarks was developed for the performance evaluation of highly parallel supercompute...
GPUs have emerged as powerful accelerators for general-purpose computations. GPUs are attached to every ...
We investigate multi-level parallelism on GPU clusters with MPI-CUDA and hybrid MPI-OpenMP-CUDA para...
We describe a new problem size, called Class D, for the NAS Parallel Benchmarks (NPB), whose MPI sou...