Both NUMA thread/data placement and hardware prefetcher configuration have significant impacts on HPC performance. Optimizing both together leads to a large and complex design space that has previously been impractical to explore at runtime. In this work we deliver the performance benefits of optimizing both NUMA thread/data placement and prefetcher configuration at runtime through careful modeling and online profiling. To address the large design space, we propose a prediction model that reduces the amount of input information needed and the complexity of the prediction required. We do so by selecting a subset of performance counters and application configurations that provide the richest profile information as inputs, and by limiting the ...
Performance bottlenecks across distributed nodes, such as in high-performance computing grids or clo...
Modern architectures provide hardware memory prefetching capabilities which can be configured at run...
As the digitisation of the world progresses at an accelerating pace, an overwhelming quantity of dat...
Both NUMA thread/data placement and hardware prefetcher configuration have significant impacts on HP...
HPC systems expose configuration options that help users optimize their applications' execution. Que...
There is a large space of NUMA and hardware prefetcher configurations that can...
Modern processors are equipped with multiple hardware prefetchers, each of which targets a ...
Non-Uniform Memory Access (NUMA) architectures are nowadays common for running...
An important technique for alleviating the memory bottleneck is data prefetching. Data prefetching ...
Nowadays, NUMA architectures are common in compute-intensive systems. Achievin...
Hardware prefetching on IBM’s latest POWER8 processor is able to improve performance of many applica...
The benefits of prefetching have been largely overshadowed by the overhead required to produce high...
tures are ubiquitous in HPC systems. NUMA along with other factors including socket layout, data pla...
Data prefetching is an effective way to bridge the increasing performance gap ...