Parallelizing industrial simulation codes such as EUROPLEXUS, a software package dedicated to the analysis of fast transient phenomena, is challenging. In this paper we focus on efficient parallelization on a multi-core shared-memory node. We propose to have each thread gather the data it needs to process a given iteration range before actually advancing the computation by one time step on that range. This lazy, cache-aware layout construction makes it possible to keep the original data structure and leads to very localised code modifications. We show that this approach can improve the execution time by up to 40% when the task size is set so that the data fit in the L2 cache.
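The gather-then-compute idea described in the abstract can be illustrated with a minimal sketch. It is not the EUROPLEXUS implementation: the one-dimensional node data, the per-element kernel, the assumed L2 capacity, and every name (`elements_per_task`, `advance_range`, `advance_all`) are hypothetical and chosen only for illustration. Python is used to show the structure; the cache benefit reported in the abstract would only be observable in a compiled code.

```python
from concurrent.futures import ThreadPoolExecutor

ASSUMED_L2_BYTES = 256 * 1024  # assumption: per-core L2 capacity
BYTES_PER_NODE = 2 * 8         # assumption: one position and one velocity, 8 bytes each


def elements_per_task(nodes_per_element, l2_bytes=ASSUMED_L2_BYTES):
    """Number of elements whose gathered node data fits in the assumed L2 size."""
    return max(1, l2_bytes // (nodes_per_element * BYTES_PER_NODE))


def advance_range(pos, vel, connectivity, out, start, stop, dt):
    """Process elements [start, stop): gather first, then compute."""
    # Gather phase: copy the scattered node data this range needs into
    # contiguous local buffers. The global arrays keep their original layout.
    local_pos = []
    local_vel = []
    for elem in connectivity[start:stop]:
        for node in elem:
            local_pos.append(pos[node])
            local_vel.append(vel[node])

    # Compute phase: the (hypothetical) kernel reads only the packed buffers.
    # Each task writes a disjoint slice of `out`, so tasks do not conflict.
    k = 0
    for e in range(start, stop):
        n = len(connectivity[e])
        total = 0.0
        for i in range(k, k + n):
            total += local_pos[i] + dt * local_vel[i]
        out[e] = total / n  # mean advanced position of the element's nodes
        k += n


def advance_all(pos, vel, connectivity, dt, workers=4, l2_bytes=ASSUMED_L2_BYTES):
    """Split the element loop into L2-sized tasks and run them on a thread pool."""
    out = [0.0] * len(connectivity)
    if not connectivity:
        return out
    step = elements_per_task(len(connectivity[0]), l2_bytes)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(advance_range, pos, vel, connectivity, out,
                        s, min(s + step, len(connectivity)), dt)
            for s in range(0, len(connectivity), step)
        ]
        for f in futures:
            f.result()  # re-raise any exception from a worker
    return out
```

The gather is done lazily inside each task rather than once for the whole mesh, so the only change to an existing loop is the short packing step at the top of `advance_range`. This mirrors the localised nature of the modification that the abstract describes.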
Cache hierarchies are increasingly non-uniform, so for systems to scale efficiently, data must be cl...
This report presents a study of techniques used to speed up a scientific simula...
Preserving memory locality is a major issue in highly-multithreaded architectures such as GPUs. Thes...
Parallelizing industrial simulation codes like the EUROPLEXUS software dedicated to the anal...
Scientific and industrial applications that need high computational performance to be used are alway...
Enhancing the match between software executions and hardware features is key to computing efficiency...
In hardware/software codesign, Discrete Event Simulation (DES) has been in use for decades to verify...
We study the performance behaviour of a seismic simulation using the ExaHyPE engine with a specific ...
In this paper we explore the idea of customizing and reusing loop schedules to improve the scalabili...
Maximal utilization of cores in multicore architectures is key to realize the potential performance ...
As transistor density continues to grow geometrically, processor manufacturers are already able to p...
The paper presents X-Kaapi, a compact runtime for multicore architectures th...
The use of multi-core architectures in real-time systems raises new issues reg...
In this paper we propose a methodology for the study of general cache networks, which is intrinsical...