Current development trends in fast processors call for an increasing number of cores, each featuring wide vector processing units. Applications must therefore exploit both forms of parallelism to run efficiently. In this work we focus on the efficient use of vector instructions. These process several data elements in parallel, and the memory data layout plays an important role in making this efficient. An optimal memory layout depends in principle on the access patterns of the algorithm, but also on the architectural features of the processor. However, different parts of an application may have different requirements, so the choice of the most efficient data structure for vectorization has to be carefully assessed. We address these pro...
We present different kernels based on Lattice-Boltzmann methods for the solution of the two-...
Lattice Boltzmann (LB) methods are a class of Computational Fluid Dynamics (CFD) methods for...
Accelerators are an increasingly common option to boost performance of codes that require extensive ...
In this paper we address the problem of identifying and exploiting techniques that optimize the perf...
We develop a Lattice Boltzmann code for computational fluid-dynamics and optimize it for ma...
We describe the implementation and optimization of a state-of-the-art Lattice Boltzmann code for com...
We develop a Lattice Boltzmann code for computational fluid-dynamics and optimize it for massively p...
This thesis presents efforts to attain efficient Lattice Boltzmann simulations on large-scale parall...
Numerical analysts and programmers are currently facing a conceptual change in processor technology....
When designing and implementing highly efficient scientific applications for parallel computers such...
We present different kernels based on Lattice-Boltzmann methods for the solution of the two-dimensio...
The lattice Boltzmann method has become a valuable tool in computational fluid dynamics, one of the ...
In this paper we report on our early experience on porting, optimizing and benchmarking a Lattice Bo...