We contribute a method that uses the CPU and GPU jointly to execute balanced parallel code generated automatically with polyhedral tools. To distribute the load evenly, the system is guided by predictions of loop nest execution times. Static and dynamic performance factors are modelled by two automatic, portable frameworks targeting CPUs and CUDA GPUs. The prediction methods comprise three parts: static code generation, offline profiling and online prediction. Each loop nest exists in multiple versions, so our scheduler balances the load for multiple combinations of code versions and selects the fastest combination before execution. This proposal is validated on the polyhedral benchmark suite, showing that CPU+GPU...
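The scheduling idea in the abstract (balance each combination of code versions using predicted times, then keep the fastest combination) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the linear cost model (fixed overhead plus per-iteration cost), the function names `predicted_time`, `balance` and `select_plan`, and the version names and numbers in the usage example.

```python
from itertools import product


def predicted_time(model, n_iters):
    """Hypothetical predictor: fixed overhead plus a per-iteration cost.

    `model` is an (overhead, per_iter) pair with non-negative values.
    A device that receives no iterations is assumed to cost nothing.
    """
    overhead, per_iter = model
    return 0.0 if n_iters == 0 else overhead + per_iter * n_iters


def balance(cpu_model, gpu_model, total_iters):
    """Split iterations so the predicted CPU and GPU times are as even as possible.

    Returns (cpu_iters, makespan); the GPU gets total_iters - cpu_iters.
    """
    def makespan(cpu_iters):
        return max(predicted_time(cpu_model, cpu_iters),
                   predicted_time(gpu_model, total_iters - cpu_iters))

    # CPU time grows and GPU time shrinks as the CPU share grows, so the
    # crossover point can be found by binary search.
    lo, hi = 0, total_iters
    while lo < hi:
        mid = (lo + hi) // 2
        if predicted_time(cpu_model, mid) < predicted_time(gpu_model, total_iters - mid):
            lo = mid + 1
        else:
            hi = mid
    # The best split is on one side of the crossover or the other.
    candidates = [c for c in (lo - 1, lo) if 0 <= c <= total_iters]
    best = min(candidates, key=makespan)
    return best, makespan(best)


def select_plan(cpu_versions, gpu_versions, total_iters):
    """Balance every CPU/GPU version pair and return the fastest predicted plan.

    Returns (makespan, cpu_version_name, gpu_version_name, cpu_iters).
    """
    plans = []
    for (cpu_name, cpu_model), (gpu_name, gpu_model) in product(
            cpu_versions.items(), gpu_versions.items()):
        cpu_iters, t = balance(cpu_model, gpu_model, total_iters)
        plans.append((t, cpu_name, gpu_name, cpu_iters))
    return min(plans)


if __name__ == "__main__":
    # Made-up (overhead, per-iteration) costs, for illustration only.
    cpu_versions = {"cpu_v0": (0.0, 4.0), "cpu_v1": (10.0, 2.0)}
    gpu_versions = {"gpu_v0": (50.0, 1.0), "gpu_v1": (5.0, 1.5)}
    print(select_plan(cpu_versions, gpu_versions, 1000))
```

In a real system the linear model would be replaced by whatever the offline profiling and online prediction stages produce; the sketch only shows how a balanced split and a version choice follow once such predictions are available.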
Selected for presentation at the HiPEAC 2013 Conf. This paper addresses the com...
In parallel computing, obtaining maximal performance is often mandatory to solve large and complex p...
This work is a part of the global tendency to use modern computing systems for modeling the phase-fi...
Technological limitations faced by semiconductor manufacturers in the early 2000s restricted t...
It is often hard to predict the performance of a statically generated code. Ha...
Nowadays, multicore machines are becoming more and more common. Ideally, all the applications benefi...
We propose a GPU fine-grained load-balancing abstraction that decouples load balancing from work pro...
Recent advances in GPUs (graphics processing units) lead to massively parallel hardware that is eas...
Maintaining computational load balance is important to the performant behavior of codes which operat...
The computational power provided by many-core graphics processing units (GPUs) has been exploited i...
Heterogeneous computing systems using one or more graphics processing units (GPUs) as accelerators p...
Scientific codes are usually highly parallelised and executed on heterogeneous architectures. Nowada...
The GPU-based heterogeneous architectures (e.g., Tianhe-1A, Nebulae), composing multi-core CPU and G...