Large-scale Non-Uniform Memory Access (NUMA) multiprocessors are gaining increased attention due to their potential for achieving high performance through the replication of relatively simple components. Because of the complexity of such systems, scheduling algorithms for parallel applications are crucial in realizing the performance potential of these systems. In particular, scheduling methods must consider the scale of the system, with the increased likelihood of creating bottlenecks, along with the NUMA characteristics of the system, and the benefits to be gained by placing threads close to their code and data. We propose a class of scheduling algorithms based on processor pools. A processor pool is a software construct for organizing an...
The recent addition of task parallelism to the OpenMP shared memory API allows programmers to expres...
Processors with multiple sockets or chiplets are becoming increasingly common. These kinds of processo...
The nano-threads programming model was proposed to effectively integrate multiprogramming ...
In this paper we describe the design, implementation and experimental evaluation of a tech...
The invention, acceptance, and proliferation of multiprocessors are primarily a result of the quest ...
Parallel data processing and parallel streaming systems have become quite popular. They are employed in v...
There has been much work in NUMA-aware (Non-Uniform Memory Access) scheduling over the past decade, all a...
© 2017 ACM. As the number of cores increases in a single chip processor, several challenges arise: w...
Over the past few years, parallel sparse direct solvers made significant progr...
Modern computing platforms are based on multi-processor/multi-core technology. This allows running a...
Performance degradation due to nonuniform data access latencies has worsened on NUMA systems and can...
We present a joint scheduling and memory allocation algorithm for efficient ex...
For systems with multicore processors, contention for shared resources is a problem that occurs when ...
Nonuniform memory access time (referred to as NUMA) is an important feature in the design of large s...
This paper presents some techniques for efficient thread forking and joining in parallel execution e...