Modern multicore systems are based on a Non-Uniform Memory Access (NUMA) design. Efficiently exploiting such architectures is notoriously complex for programmers. One of the key concerns is to limit as much as possible the number of remote memory accesses (i.e., main memory accesses performed from a core to a memory bank that is not directly attached to it). However, in many cases, existing profilers do not provide enough information to help programmers achieve this goal. This paper presents MemProf, a profiler that allows programmers to choose and implement efficient application-level optimizations for NUMA systems. MemProf builds temporal flows of interactions between threads and objects, which help programmers unders...
In modern parallel architectures, memory accesses represent a common bottlenec...
As the number of cores increases, Non-Uniform Memory Access (NUMA) is becoming increasingly prevalent...
Processors with multiple sockets or chiplets are becoming more conventional. These kinds of processo...
Modern multicore systems are based on a Non-Uniform Memory Access (NUMA) desig...
Modern multicore systems are based on a Non-Uniform Memory Access (NUMA) design. In a NUMA system, c...
This paper introduces two novel algorithms for thread migrations, named CIMAR (Core-aware Interchang...
A common approach to improve memory access in NUMA machines exploits operating system (OS) page prot...
Within the last decade, microprocessor development reached a point at which higher clock rates and m...
A multiprocessor system with uniform memory access is difficult to scale due to the increasing conte...
Multi-core platforms with non-uniform memory access (NUMA) design are now a common resource in High ...
Nowadays, on hierarchical shared memory multiprocessors with Non-Uniform Memory Access (NUMA), the n...
Current high-performance multicore processors provide users with a non-uniform memory access model (...
Non-Uniform Memory Access (NUMA) architectures are nowadays common for running...
Exploiting the full computational power of current hierarchical multiprocessor...
Multi-core nodes with Non-Uniform Memory Access (NUMA) are now a common architecture for high perfor...