With discrete Intel GPUs entering the high-performance computing landscape, there is an urgent need for production-ready software stacks for these platforms. In this article, we report how we enable the Ginkgo math library to execute on Intel GPUs by developing a kernel backend based on the DPC++ programming environment. We discuss conceptual differences between the CUDA and DPC++ programming models and describe workflows for simplified code conversion. We evaluate the performance of basic and advanced sparse linear algebra routines available in Ginkgo's DPC++ backend relative to the hardware-specific performance bounds and compare against routines providing the same functionality that ship with Intel's oneMKL vendor library.
Over the past few years, we have seen an exponential performance boost of the graphics processing un...
As an open, royalty-free framework for writing programs that execute across heterogeneous platforms,...
We analyze the efficiency of servers equipped with state-of-the-art general-purpose multicore proces...
Ginkgo is a production-ready sparse linear algebra library for high performance computing on GPU-cen...
The Intel DPC++ Compatibility Tool is a component of the Intel oneAPI Base Toolkit. This tool automa...
The relentless demands for improvements in compute throughput and energy efficiency have driven...
High performance computing is a topic that has risen to the top in the era of digitalization, AI and ...
© ACM, YYYY. This is the author's version of the work "Anzt, H., Cojean, T., Flegar, G., Göbel, F., ...
Graphics Processing Units (GPUs) have become a key technology for accelerating node performance in s...
Source code portability is becoming increasingly important in the development of new solutions in HP...
The proliferation of accelerators, in particular GPUs, over the past decade is impacting the way s...
With processor clock speeds having stagnated, parallel computing architectures have achieved a break...
To face the programming challenges related to heterogeneous computing, Intel recently introduced one...
Graphic processors are becoming faster and faster. Computational power within graphic processing uni...
The recent dramatic progress in machine learning is partially attributed to the availability of high...