Although modern supercomputers are composed of multicore machines, some scientists still run legacy applications that were developed for single-core clusters, where the memory hierarchy is dedicated to a single core. The main objective of this paper is to propose and evaluate an algorithm that identifies an efficient blocksize for MPI stencil computations on multicore machines. In light of an extensive experimental analysis, this work shows the benefits of identifying blocksizes that divide the data among the cores, and it suggests a methodology that explores the memory hierarchy available in modern machines.
In the field of structured parallel programming we study and implement a shared-memory runtime suppo...
The convergence of highly parallel many-core graphics processors with conventional multi-core proces...
Application codes reliably achieve performance far less than the advertised capabilities of existing...
PRACE 2IP White Paper. On multi-core clusters or supercomputers, how to get good performance when runn...
New algorithms and optimization techniques are needed to balance the accelerating trend towards band...
High Performance Computing (HPC) can be defined as the practice of combining computing power to atta...
Stencil computation represents an important numerical kernel in scientific com...
Stencil-based kernels constitute the core of many scientific applications on block-structured grids....
The importance of stencil-based algorithms in computational science has focused attention ...
This paper describes a new technique for optimizing serial and parallel stencil- and stencil-like op...
The multicore era has initiated a move to ubiquitous parallelization of software. In the process, co...
Stencil computation (SC) is of critical importance for broad scientific and engineering applications...
This paper proposes tiling techniques based on data dependencies and not in code structur...
Processors have become faster at a much quicker rate than memory access time, creating wide gap betw...
A current challenge for computer users is to fully exploit performance of new Multicore syst...