This paper introduces two novel algorithms for thread migration, CIMAR (Core-aware IMAR, where IMAR stands for Interchange and Migration Algorithm with performance Record) and NIMAR (Node-aware IMAR), and a new algorithm for the migration of memory pages, LMMA (Latency-based Memory pages Migration Algorithm), in the context of Non-Uniform Memory Access (NUMA) systems. Such systems have complex memory hierarchies that make it challenging to extract the best possible performance, and thread and memory mapping play a critical role. The presented algorithms gather and process the information provided by hardware counters to decide which migrations to perform, with the aim of finding the optimal mapping. They have been implemented...
Modern multicore systems are based on a Non-Uniform Memory Access (NUMA) desig...
An important aspect of workload characterization is understanding memory system performance...
A multiprocessor system with uniform memory access is difficult to scale due to the increasing conte...
Multi-core nodes with Non-Uniform Memory Access (NUMA) are now a common architecture for high perfor...
Nowadays, NUMA architectures are common in compute-intensive systems. Achievin...
A common approach to improve memory access in NUMA machines exploits operating system (OS) page prot...
Processors with multiple sockets or chiplets are becoming more conventional. These kinds of processo...
Modern multicore systems are based on a Non-Uniform Memory Access (NUMA) design. In a NUMA system, c...
Current multi-socket systems have complex memory hierarchies with significant Non-Uniform Memory Acc...
Memory access latency is hence non-uniform, because it depends on where the request ori...
Multicore multiprocessors use Non Uniform Memory Architecture (NUMA) to improve their scalability. ...
Multicore multiprocessors use a Non Uniform Memory Architecture (NUMA) to improve their scalability....
Non Uniform Memory Access (NUMA) architectures are nowadays common for running...
Exploiting the full computational power of current hierarchical multiprocessor...
Multi-core platforms with non-uniform memory access (NUMA) design are now a common resource in High ...