Modern architectures have multiple processors, each containing multiple cores and connected to dedicated memory blocks, forming NUMA (Non-Uniform Memory Access) architectures. NUMA systems have different latencies when accessing different blocks of distributed memory. A major challenge is to schedule the tasks produced by parallel applications efficiently among the available processors while accounting for the heterogeneity of these memory access times. To do so, the scheduler must make decisions influenced by several factors. One of these factors is data locality: the decision to allocate a task to a specific processor requires assessing the costs of accessing its data on the asymmetric memory struc...
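The data-locality assessment described in this abstract can be illustrated with a minimal sketch. It is not the authors' scheduler: the names `placement_cost` and `best_node`, the two-node distance matrix, and the byte counts are all hypothetical, and the cost model (bytes touched times a relative node distance) is an assumption chosen only to show the idea.

```python
# Hypothetical sketch: pick the NUMA node where a task's data is cheapest to reach.
# Assumption: cost = sum over home nodes of (bytes stored there * relative distance).

def placement_cost(node, data_bytes_per_node, distance):
    """Estimated cost of running a task on `node`, given where its data lives."""
    return sum(
        nbytes * distance[node][home]
        for home, nbytes in data_bytes_per_node.items()
    )


def best_node(data_bytes_per_node, distance):
    """Return the node index with the lowest estimated access cost."""
    return min(
        range(len(distance)),
        key=lambda n: placement_cost(n, data_bytes_per_node, distance),
    )


# Illustrative 2-node machine: local access weight 10, remote weight 21.
DISTANCE = [
    [10, 21],
    [21, 10],
]

# Illustrative task: 4096 bytes on node 0 and 1024 bytes on node 1.
task_data = {0: 4096, 1: 1024}

if __name__ == "__main__":
    print(best_node(task_data, DISTANCE))
```

A real scheduler would weigh this estimate against the other factors the abstract mentions, such as processor load, rather than using locality alone.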
A MPAHA (Model for Parallel Algorithms on Heterogeneous Architectures) model that allows predicting ...
This paper introduces a learning-based framework for dynamic placement of threads of parallel applic...
For systems with multicore processors, contention for shared resources is a problem that occurs when ...
The recent addition of data dependencies to the OpenMP 4.0 standard provides t...
There has been much work in NUMA-aware (Non-Uniform Memory Access) scheduling over the past decade, all a...
We present a joint scheduling and memory allocation algorithm for efficient ex...
Within the last decade, microprocessor development reached a point at which higher clock rates and m...
Nowadays the evolution of High Performance Computing follows the needs of numerical simulations. Thes...
Dynamic task-parallel programming models are popular on shared-memory systems,...
Over the past few years, parallel sparse direct solvers made significant progr...
Parallel data processing and parallel streaming systems have become quite popular. They are employed in v...
© 2017 ACM. As the number of cores increases in a single chip processor, several challenges arise...
The complexity of shared memory systems is becoming more relevant as the number of memory domains in...
Exploiting the full computational power of current hierarchical multiprocessor...
Funding: This work has been supported by the European Union grant EU H2020-ICT-2014-1 project RePhra...