Stencil computation (SC) is critically important for a broad range of scientific and engineering applications. However, optimizing complex, high-order SC on emerging clusters of multicore processors is a challenge. We have developed a hierarchical SC parallelization framework that combines: (1) spatial decomposition based on message passing; (2) multithreading using a critical section-free, dual representation; and (3) single-instruction multiple-data (SIMD) parallelism based on various code transformations. Our SIMD transformations include translocated statement fusion, vector composition via shuffle, and vectorized data layout reordering (e.g., matrix transpose), which are combined with traditional optimization techniques such as loop unrolling. ...
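As a rough illustration of item (1) above, the following minimal sketch applies a 3-point averaging stencil to a 1D array, first over the whole domain and then subdomain by subdomain with one halo cell per side. It is not the framework described in the abstract: the stencil, the function names `stencil_step` and `decomposed_step`, and the list slicing that stands in for message passing are all assumptions made for illustration.

```python
def stencil_step(u):
    """One sweep of a 3-point averaging stencil; the two end values are held fixed."""
    new = list(u)
    for i in range(1, len(u) - 1):
        new[i] = (u[i - 1] + u[i] + u[i + 1]) / 3.0
    return new


def decomposed_step(u, n_parts):
    """Same sweep, computed per subdomain with one halo cell on each side.

    Hypothetical stand-in for spatial decomposition: slicing the global
    list plays the role of a halo exchange between neighbouring ranks.
    """
    n = len(u)
    bounds = [n * p // n_parts for p in range(n_parts + 1)]
    result = []
    for p in range(n_parts):
        lo, hi = bounds[p], bounds[p + 1]
        # Extend the owned range [lo, hi) by one neighbouring cell per side.
        h_lo, h_hi = max(lo - 1, 0), min(hi + 1, n)
        local = stencil_step(u[h_lo:h_hi])
        # Keep only the cells this subdomain owns; halo cells are discarded.
        result.extend(local[lo - h_lo:hi - h_lo])
    return result


if __name__ == "__main__":
    field = [float(i * i % 7) for i in range(10)]
    print(decomposed_step(field, 3) == stencil_step(field))
```

The halo width equals the stencil radius (one cell here); a higher-order stencil of the kind the abstract mentions would need a correspondingly wider halo.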
As the cost of data movement increasingly dominates performance, developers of finite-volume and fin...
Physics-based simulation, Computational Fluid Dynamics (CFD) in particular, has substantially reshap...
This paper describes the design and the performance of DOMINO, a 3D Cartesian SN solver that impleme...
2011-07-13: The advent of the multi-core/many-core paradigm has provided unprecedented computing power, an...
Understanding the most efficient design and utilization of emerging multicore systems is one of the ...
2012-04-27: The shift to the many-core architecture design paradigm in the computer market has provided unprec...
New algorithms and optimization techniques are needed to balance the accelerating trend towards band...
Stencil computation represents an important numerical kernel in scientific com...
PDE discretization schemes yielding stencil-like computing patterns are commonly used for seismic mo...
In this paper we address the problem of identifying and exploiting techniques that optimize the perf...
The implementation of stencil computations on modern, massively parall...
The lattice-Boltzmann method (LBM), a promising new particle-based simulation technique for complex and m...
Current development trends of fast processors call for an increasing number of cores, each core fea...
We present a software approach to hardware-oriented numerics which builds upon an augmented, previou...
Application codes reliably achieve performance far less than the advertised capabilities of existing...