The available memory bandwidth of existing high-performance computing platforms is increasingly the limiting factor for various applications. Therefore, modern microarchitectures integrate the memory controller on the processor chip, which leads to non-uniform memory access (NUMA) behavior in such systems. This access behavior in turn poses major challenges for the development of shared-memory parallel applications. Improperly implemented memory access results in a poor ratio of local to remote memory accesses and causes low performance on such architectures. To address this problem, developers of such applications rely on tools that make these kinds of performance problems visible. This work presents a new too...
For a few decades, to reduce energy consumption, processor vendors have been building more and more parallel c...
Today’s microprocessors include multicores that feature a diverse set of compute cores and onboard m...
Current and future architectures rely on thread-level parallelism to sustain p...
Shared memory applications running transparently on top of NUMA architectures often face severe perf...
Non-Uniform Memory Access (NUMA) architectures are nowadays common for running...
We show how to analyze the locality of memory accesses using Aftermath, an open...
In modern parallel architectures, memory accesses represent a common bottlenec...
OpenMP has become the dominant standard for shared memory programming. It is traditionall...
Modeling and simulation are crucial in high-performance computing (HPC), with ...
In scalable multiprocessor architectures, the times required for a processor to access various porti...
As multicore architectures become mainstream, an in-depth understanding of how applications behave o...
The demand for large compute capabilities in scientific computing led to wide use and acceptance of ...
The growing gap between processor and memory speeds has led to complex memory hierarchies as proces...
Scalable multiprocessors that support a shared-memory image to application programmers are typically...