Abstract
Parallelizing industrial simulation codes such as EUROPLEXUS, a software package dedicated to the analysis of fast transient phenomena, is challenging. In this paper we focus on efficient parallelization on a multi-core shared-memory node. We propose that each thread gather the data it needs to process a given iteration range before actually advancing the computation by one time step on that range. This lazy, cache-aware layout construction keeps the original data structure intact and leads to very localised code modifications. We show that this approach can reduce the execution time by up to 40% when the task size is set so that the data fit in the L2 cache.
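The gather-then-compute idea described in the abstract can be illustrated with a minimal sketch. This is not the EUROPLEXUS implementation: the `Mesh` structure, the functions `advance_range` and `advance_all`, the placeholder update rule, and the assumption that each element refers to a distinct node (so that ranges never write to the same entry) are all hypothetical and chosen only for illustration. The driver loop is sequential here; in a parallel setting each range would be a task executed by one thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the original data structure, which is left
// unchanged: elements reach their data through an index table, so the
// accesses are scattered in memory.
struct Mesh {
    std::vector<double> node_value;      // one value per node
    std::vector<std::size_t> elem_node;  // node used by each element (assumed distinct)
};

// Advance elements [begin, end) by one time step of size dt.
void advance_range(Mesh& mesh, std::size_t begin, std::size_t end, double dt) {
    assert(begin <= end && end <= mesh.elem_node.size());
    const std::size_t n = end - begin;

    // 1. Gather: copy the data this range needs into a contiguous buffer.
    //    The range is meant to be small enough for the buffer to stay in cache.
    std::vector<double> local(n);
    for (std::size_t i = 0; i < n; ++i)
        local[i] = mesh.node_value[mesh.elem_node[begin + i]];

    // 2. Compute on the contiguous buffer (placeholder update rule).
    for (std::size_t i = 0; i < n; ++i)
        local[i] += dt * local[i];

    // 3. Scatter the results back into the original structure.
    for (std::size_t i = 0; i < n; ++i)
        mesh.node_value[mesh.elem_node[begin + i]] = local[i];
}

// Split the iteration space into ranges of task_size elements.
// Each call to advance_range is an independent task under the
// distinct-node assumption above.
void advance_all(Mesh& mesh, std::size_t task_size, double dt) {
    if (task_size == 0) task_size = 1;  // avoid a non-advancing loop
    const std::size_t n = mesh.elem_node.size();
    for (std::size_t b = 0; b < n; b += task_size)
        advance_range(mesh, b, std::min(b + task_size, n), dt);
}
```

The `task_size` parameter plays the role of the task size mentioned in the abstract: it bounds how much data is gathered per range, and would be tuned so that the gathered buffer fits in the L2 cache.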
The gap between processor speed and memory latency has led to the use of caches in the memory system...
In this paper we propose a methodology for the study of general cache networks, which is intrinsical...
The performance and energy efficiency of modern architectures depend on memory locality, which can b...
Scientific and industrial applications that need high computational performance to be used are alway...
In hardware/software codesign, Discrete Event Simulation (DES) has been in use for decades to verify...
We study the performance behaviour of a seismic simulation using the ExaHyPE engine with a specific ...
Enhancing the match between software executions and hardware features is key to computing efficiency...
We present a new technique for the parallel simulation of cache coherent shared memory multiprocess...
An application’s cache miss rate is used in timing analysis, system performance prediction and ...
Shared-memory multi-processor/multi-core machines have become a reference for many application conte...
Architects have adopted the shared memory model that implicitly manages cache coherence and cache ca...
In this paper we explore the idea of customizing and reusing loop schedules to improve the scalabili...
As transistor density continues to grow geometrically, processor manufacturers are already able to p...
Proceedings of the First PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD 2016)...