Computers have become increasingly complex with the emergence of heterogeneous hardware combining multicore CPUs and GPUs. These parallel systems exhibit tremendous computational power at the cost of increased programming effort, resulting in a tension between performance and code portability. Typically, code is either tuned in a low-level imperative language using hardware-specific optimizations to achieve maximum performance, or written in a high-level, possibly functional, language to achieve portability at the expense of performance. We propose a novel approach that aims to combine high-level programming, code portability, and high performance. Starting from a high-level functional expression, we apply a simple set of rewrite rul...
Optimizing programs to run efficiently on modern parallel hardware is hard but crucial for many appl...
Initially driven by a strong need for increased computational performance in science and engineerin...
The relentless demands for improvements in compute throughput and energy efficiency have driven...
Computing systems have become increasingly complex with the emergence of heterogeneous hardware comb...
Graphics Processing Units (GPUs) are now commonplace in computing systems and are the most successf...
Parallel accelerators such as GPUs are notoriously hard to program; exploiting their full pe...
This work describes my solution to the performance portability problem: between CPUs and GPUs in par...
General-purpose GPU-based systems are highly attractive, as they give potentially massive performanc...
Parallel patterns (e.g., map, reduce) have gained traction as an abstraction for targeting parallel ...
The problem of automatically generating hardware modules from high-level application representations...
The rising pressure to simultaneously improve performance and reduce power consumption is driving mo...
Manycore architectures are now available in a wide range of HPC systems. Going...
As an open, royalty-free framework for writing programs that execute across heterogeneous platforms,...
Application development for modern high-performance systems with many cores, i.e., comprising multip...