Abstract
High-level C++ proxies for the convenient manipulation of subvectors and submatrices on OpenCL-enabled devices are introduced. It is demonstrated that the programming convenience of standard host-based code can be retained using native C++ language features only, even if massively parallel computing architectures such as graphics processing units are employed. The required modifications of the underlying OpenCL kernels are discussed and a case study of an implementation of the QR-factorization is given. Benchmark results confirm that the convenience of purely CPU-based libraries can be preserved without sacrificing performance of OpenCL-enabled devices, particularly graphics processing units.
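To make the proxy idea concrete, the following is a minimal host-only sketch, not the paper's implementation. The names `range`, `subvector`, `make_subvector`, and `proxy_demo` are hypothetical and chosen for illustration only. A `std::vector` stands in for a device buffer; on an OpenCL device the proxy would presumably hold a buffer handle, and the offset and size would be forwarded to the kernels as additional arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical half-open index range [start, stop).
struct range {
    std::size_t start;
    std::size_t stop;
    std::size_t size() const { return stop - start; }
};

// Hypothetical proxy: refers to a contiguous slice of an existing vector
// without copying it. Assumption: in a device-based version, data_ would be
// a memory-buffer handle and the loops below would be kernel launches that
// receive the offset (r_.start) and the size as extra arguments.
template <typename T>
class subvector {
public:
    subvector(std::vector<T>& data, range r) : data_(data), r_(r) {
        assert(r.start <= r.stop && r.stop <= data.size());
    }

    std::size_t size() const { return r_.size(); }

    // Element access is shifted by the start offset of the range.
    T& operator[](std::size_t i) { return data_[r_.start + i]; }
    const T& operator[](std::size_t i) const { return data_[r_.start + i]; }

    // In-place addition of another slice of equal length.
    subvector& operator+=(const subvector& other) {
        assert(size() == other.size());
        for (std::size_t i = 0; i < size(); ++i) (*this)[i] += other[i];
        return *this;
    }

    // In-place scaling of the slice by a scalar.
    subvector& operator*=(T alpha) {
        for (std::size_t i = 0; i < size(); ++i) (*this)[i] *= alpha;
        return *this;
    }

private:
    std::vector<T>& data_;  // the full vector that owns the storage
    range r_;               // the part of it this proxy refers to
};

// Hypothetical convenience function that creates a proxy for a slice of v.
template <typename T>
subvector<T> make_subvector(std::vector<T>& v, range r) {
    return subvector<T>(v, r);
}

// Usage: the operations on the proxies write through to the full vector.
std::vector<double> proxy_demo() {
    std::vector<double> v{1.0, 2.0, 3.0, 4.0, 5.0, 6.0};
    subvector<double> head = make_subvector(v, range{0, 3});
    subvector<double> tail = make_subvector(v, range{3, 6});
    head += tail;  // first three entries become 5, 7, 9
    tail *= 2.0;   // last three entries become 8, 10, 12
    return v;
}
```

The point of the sketch is that code using `head` and `tail` reads like ordinary vector code and relies on plain C++ operator overloading only, while the proxy carries the offset bookkeeping that a device kernel would need.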
This paper reports on the development of an MPI/OpenCL implementation of LU, an application-level be...
Accelerator processors allow energy-efficient computation at high performance, especially for comput...
Hardware designers and engineers typically need to explore a multi-parametric design space in order ...
As an open, royalty-free framework for writing programs that execute across heterogeneous platforms,...
Manycore architectures are now available in a wide range of HPC systems. Going...
High performance parallel computing was something exclusive for expensive specialized hardware some ...
In the last decade graphics processors (GPUs) have been extensively used to solve computationally i...
OpenCL is a programming language standard which enables the programmer to express the application by...
OpenCL, a modern parallel heterogeneous system programming language, enables problems to be partitio...
Recent developments in processor architecture have settled a shift from sequential processing to par...
The problem of automatically generating hardware modules from high level application representations...
In this work, we evaluate OpenCL as a programming tool for developing performance-portable applicati...
This work describes my solution to the performance portability problem: between CPUs and GPUs in par...
OpenCL has been designed to achieve functional portability across multi-core devices from different ...