The high-performance computing landscape is shifting from collections of homogeneous nodes towards heterogeneous systems, in which nodes consist of a combination of traditional out-of-order execution cores and accelerator devices. Accelerators, built around GPUs, many-core chips, FPGAs or DSPs, are used to offload compute-intensive tasks. The advent of this type of system has brought about a wide and diverse ecosystem of development platforms, optimization tools and performance analysis frameworks. This is a review of the state of the art in performance tools for heterogeneous computing, focusing on the most popular families of accelerators: GPUs and Intel's Xeon Phi. We describe current heterogeneous systems and the development frameworks...
Tuning the performance of applications requires understanding the interactions between code and targ...
In this paper, a set of micro-benchmarks is proposed to determine basic performance parameters of si...
Recent trends in computing architecture development have focused on exploiting task- and data-level ...
High-performance computing platforms are moving from homogeneous individual units to heterogeneous sy...
Intel's Xeon Phi combines the parallel processing power of a many-core accelerator with the programm...
Accelerators, such as GPUs and Intel Xeon Phis, have become the workhorses of high-performance compu...
Advanced accelerator simulations have played a prominent role in the design and analysis of modern a...
High energy colliders are essential to study the inner structure of nuclear and elementary particle...
In recent years the designs of High Performance Computing (HPC) clusters have become more complex. T...
Emerging computer architectures and advanced computing technologies, such as Intel’s Many Integrated...
The next generation of supercomputers will feature a diverse mix of accelerator devices. The increas...
The design of microprocessor technology has hit several "walls" in recent decades. These limits o...
The goal of reaching exascale computing is made especially challenging by the highly heterogeneous n...