Languages for efficient parallel programming need to achieve high performance portability in order to harness the power offered by rapidly evolving parallel architectures. We combine high-level, architecture-aware cost modelling with low-level, explicit control of coordination as a programming model to improve performance portability. We explore and quantify the impact of heterogeneity in modern parallel architectures on the performance of parallel programs across a range of clusters of multi-cores that vary in architectural parameters such as processor speed, memory size and interconnection speed. Additionally, we develop several formal cost models and automatically use these architectural characteristics to determine suitable...
Good locality is critical for the scalability of parallel computations. Many cost models that quanti...
Modern microprocessor architectures have gradually incorporated support for parallelism. In the past...
We propose a new model for parallel speedup that is based on two parameters, the average parallelism...
We survey parallel programming models and languages using six criteria to assess their suitability ...
The design of microprocessor technology has hit several "walls" in recent decades. These limits o...
Institute for Computing Systems Architecture. Programming parallel computers remains a difficult task....
my own. Where information has been derived from other sources, I confirm that this has been indicate...
In parallel programming, the need to manage communication, load imbalance, and irregularities in th...
Combining easy-to-use parallelism, portability and efficiency is a very hard task when traditional p...
This work describes my solution to the performance portability problem: between CPUs and GPUs in par...
As the complexity of parallel computers grows, constraints posed by the construction of larger syste...
We evaluate the impact of programming language features on the performance of parallel applications...
The European FP7 project PEPPHER is addressing programmability and performance portability for curre...