Compute intensive applications running on clusters of shared-memory computers are typically implemented using OpenMP and MPI. Applications are difficult to program, debug and maintain, and performance portability is usually limited. Several program transformations have to be applied at multiple levels of the software and hardware stack to expose parallelism, choose an adequate granularity and finally map it onto the target system in order to obtain high-performing code. This thesis focuses on different aspects of the task parallel data-flow model. We address performance issues at each of the three main levels of the cluster: single-core level, single-node level and multi-node level. At the single-core level, we effectively target a hard...
Accelerators, such as GPUs and Intel Xeon Phis, have become the workhorses of high-performance compu...
Computer architecture has looming challenges with finding program parallelism, process technology li...
Heterogeneous supercomputers with GPUs are one of the best candidates to buil...
It has become common knowledge that parallel programming is needed for scientific applications, part...
As the demand increases for high performance and power efficiency in modern computer runtime systems...
Task-parallel languages are increasingly popular. Many of them provide expressive mechanisms for int...
Parallel task-based programming models like OpenMP support the declaration of task data dependences....
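To make the mechanism mentioned above concrete, the following is a minimal illustrative sketch (not taken from any of the cited works) of OpenMP task data dependences in C: the `depend(out:)`/`depend(in:)` clauses let the runtime order the consumer task after the producer task.

```c
/* Minimal sketch of OpenMP task data dependences (illustrative only):
 * the consumer task reads x only after the producer task has written it. */
#include <stdio.h>

int main(void) {
    int x = 0;

    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp task depend(out: x)   /* producer: writes x */
        x = 42;

        #pragma omp task depend(in: x)    /* consumer: ordered after the producer */
        printf("x = %d\n", x);
    }
    return 0;
}
```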
For a wide variety of applications, both task and data parallelism must be exploited to achieve the ...
With the introduction of more powerful and massively parallel embedded processors, embedded systems ...
Modern parallel programming models perform their best under the particular patterns they are tuned t...
MPI is the predominant model for parallel programming in technical high performance computing. With ...
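As a point of reference for the message-passing model described above, here is a minimal point-to-point MPI sketch in C (illustrative only, using the standard MPI C API): rank 0 sends an integer to rank 1, which prints it.

```c
/* Minimal MPI point-to-point sketch (illustrative only). */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        int value = 7;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        int value;
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}
```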
Applications are increasingly being executed on computational systems that have hierarchical paralle...
Distributed Memory Multicomputers (DMMs) such as the IBM SP-2, the Intel Paragon and the Thinking Ma...
Twenty-first century parallel programming models are becoming really complex due to the dive...
Nowadays, parallel computers have become ubiquitous and current processors contain several execution ...