The road towards Exascale Computing requires a holistic effort to address three different challenges simultaneously: high performance, energy efficiency, and programmability. The use of runtime task schedulers to orchestrate parallel executions with minimal developer intervention has been introduced in recent years to tackle the programmability issue while maintaining, or even improving, performance. In this paper, we enhance the SuperMatrix runtime task scheduler integrated in the libflame library in two different directions that address high performance and energy efficiency. First, we extend the runtime by accommodating hybrid parallel executions and managing task priorities for dense linear algebra operations, with remarkable performanc...
The OmpSs programming model supports task-based parallelism in a similar manner to OpenMP. This whit...
8th Workshop on Applications for Multi-Core Architectures. In this paper, we ana...
On the road to exascale computing, the gap between hardware peak performance and application perform...
The emergence of new manycore architectures, such as the Intel Xeon Phi, poses new challenges in how...
This paper addresses the efficient exploitation of task-level parallelism, present in many dense lin...
This paper analyzes the impact on power consumption of two DVFS-control strategies when appli...
This paper presents a dynamic task scheduling approach to executing dense linear algebra algorithms ...
This paper addresses the efficient exploitation of task-level parallelism, present in many dense lin...
The ever-increasing supercomputer architectural complexity emphasizes the need...
Dealing with asymmetry in the architecture opens a plethora of questions related to the performan...
The management of power consumption while simultaneously delivering acceptable levels of perfor...
Recent accelerators such as GPUs achieve better cost-performance and watt-performance ratio, while t...
We present the use of a hybrid static/dynamic scheduling strategy of the task dependency graph for d...
Being on the verge of exascale performance has shifted the prioritization of performance in applicat...