We study the performance behaviour of a seismic simulation using the ExaHyPE engine with a specific focus on memory characteristics and energy needs. ExaHyPE combines dynamically adaptive mesh refinement (AMR) with ADER-DG. It is parallelized using tasks, and it is cache efficient. AMR plus ADER-DG yields a task graph which is highly dynamic in nature and comprises both arithmetically expensive tasks and tasks which challenge the memory’s latency. The expensive tasks and thus the whole code benefit from AVX vectorization, although we suffer from memory access bursts. A frequency reduction of the chip improves the code’s energy-to-solution. Yet, it does not mitigate burst effects. The bursts’ latency penalty becomes worse once we add Intel O...
The computational resources required in scientific research for key areas, such as medicine, physics...
This thesis proposes novel, efficient execution-paradigms for parallel heterogeneous architectures. ...
Multi-core processors have become the dominant processor architecture with 2, 4, and 8 cores on a ch...
In an era where we cannot afford to checkpoint frequently, replication is a generic way forward to ...
Balancing the workload of sophisticated simulations is inherently difficult, since we have to balanc...
Computing hardware, from mobile devices to supercomputer clusters, is undergoi...
We study codes deploying multiple MPI ranks to one node where each rank is parallelised with TBB. A...
Recent many-core processors such as Intel’s Xeon Phi and GPGPUs specialize in running highly scalabl...
In the march towards exascale, supercomputer architectures are undergoing a significant change. Limi...
Architects have adopted the shared memory model that implicitly manages cache coherence and cache ca...
With the advent of manycore systems, shared memory parallelisation has gained importance in high per...
This poster presents the use of Damaris, an I/O middleware for post-petascale ...
Computer simulation has become increasingly important in many scientific disciplines, but its perfor...
Parallelizing industrial simulation codes like the EUROPLEXUS software dedicat...
New architectures for extreme-scale computing need to be designed for higher energy efficiency than ...