Parallelizing nested-loop algorithms on popular multi-level parallel architectures, such as clusters of SMPs, is not trivial, since the data dependencies in the algorithm impose severe restrictions on the task decomposition that can be applied. In this paper we propose three techniques for the parallelization of such algorithms, namely pure MPI parallelization, fine-grain hybrid MPI/OpenMP parallelization and coarse-grain hybrid MPI/OpenMP parallelization. We further apply an advanced hyperplane scheduling scheme that enables pipelined execution and the overlapping of communication with useful computation, thus leading to almost full CPU utilization. We implement the three variations and perform a number ...
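The abstract does not spell out the hyperplane scheduling scheme, so the following is only a minimal sketch of the general idea, assuming a two-dimensional loop nest in which each cell depends on its upper and left neighbours. Cells with the same value of t = i + j lie on one hyperplane and do not depend on each other, so they could be handed to different processes or threads within one pipeline step. The function names and the recurrence are hypothetical illustrations, not the authors' implementation, and no MPI or OpenMP calls are involved.

```python
from collections import defaultdict


def hyperplane_schedule(rows, cols):
    """Group cell coordinates (i, j) by the hyperplane t = i + j.

    Hypothetical helper: returns one list of cells per time step.
    """
    steps = defaultdict(list)
    for i in range(rows):
        for j in range(cols):
            steps[i + j].append((i, j))
    return [steps[t] for t in range(rows + cols - 1)]


def run_sequential(rows, cols):
    """Reference loop nest: each interior cell depends on its upper and left neighbours."""
    a = [[1] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            a[i][j] = a[i - 1][j] + a[i][j - 1]
    return a


def run_by_hyperplanes(rows, cols):
    """Same computation, visited hyperplane by hyperplane."""
    a = [[1] * cols for _ in range(rows)]
    for step in hyperplane_schedule(rows, cols):
        # Cells in one step are mutually independent: both inputs of (i, j)
        # lie on the previous hyperplane, so this inner loop is the part
        # that could run in parallel (assumption: one worker per cell or tile).
        for i, j in step:
            if i > 0 and j > 0:
                a[i][j] = a[i - 1][j] + a[i][j - 1]
    return a


if __name__ == "__main__":
    for t, cells in enumerate(hyperplane_schedule(2, 3)):
        print(t, cells)
    print(run_by_hyperplanes(4, 5) == run_sequential(4, 5))
```

In a distributed setting each cell would typically stand for a tile of iterations, and the values passed between consecutive hyperplanes would be the data exchanged between neighbouring processes.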
Abstract. The Hybrid method of parallelization (using MPI for inter-node communication and OpenMP fo...
The symmetric multiprocessing (SMP) cluster system, which consists of shared memory nodes with sever...
With the end of Dennard scaling, future high performance computers are expected to consist of distri...
Multicore computers have been widely included in cluster systems. They are shared memory...
After a brief introduction on Cross Motif Search and its OpenMP and Hybrid OpenMP-MPI implementatio...
Overview: Most HPC systems are clusters of shared memory nodes. To use such systems efficiently both...
This technical report is an introduction to using a hybrid parallel programming model that combines ...
Most HPC systems are clusters of shared memory nodes. Parallel programming must combine the distribu...
The mixed-mode OpenMP and MPI programming models in parallel applications have a significant impact on ...
Most HPC systems are clusters of shared memory nodes. To use such systems efficiently both memory co...
Most HPC systems are clusters of shared memory nodes. Parallel programming must combine the distribu...
Most high-performance, scientific libraries have adopted hybrid parallelization schemes - such as t...
This paper presents a complete framework for the parallelization of nested loops by applying tiling ...
This paper applies a Hybrid MPI-OpenMP programming model with a thread-to-thread communication meth...
Hybrid MPI/OpenMP and pure MPI on clusters of multi-core SMP nodes involve several mismatch problems...