Computational science has benefited in recent years from emerging accelerators that increase the performance of scientific simulations, but using these devices complicates programming. This paper presents AMA, a set of optimization techniques for efficiently managing multi-accelerator systems. AMA maximizes the overlap of computation and communication without blocking, so the time otherwise spent waiting for device operations can be used for other work. AMA is implemented on top of a task-based framework. Its experimental evaluation on a quad-GPU node shows that it reaches the performance of hand-tuned native CUDA code while fully hiding device management. In addition, we obtain up to more than 2x performance...
This work studies programmability enhancing abstractions in the context of accelerators and heteroge...
Heterogeneous supercomputers that incorporate computational accelerators such as GPUs are increasing...
Current HPC clusters are composed of several machines with different computatio...
During the past decade, accelerators, such as NVIDIA CUDA GPUs and Intel Xeon Phis, have seen an inc...
Modern high-performance computers engage a variety of computing devices. Underutilization and oversu...
Heterogeneous parallel computing combines general purpose processors with accelerators to efficientl...
Accelerators, such as GPUs and Intel Xeon Phis, have become the workhorses of high-performance compu...
Computational demands are continuously increasing, driven by the growing resource demands of applica...
As technology scaling slows down and only provides diminishing improvements in general-purpose proce...
Modern computing systems comprise heterogeneous designs which combine multiple and diverse architec...
There is a clear trend nowadays to use heterogeneous high-performance computers, as they offer consi...
Hardware accelerators have become permanent features in the post-Dennard computing landscape, displa...