Our study proposes a Reducing-size Task Assignation technique (RTA), a novel approach to solving the grain-size problem for the hybrid MPI-OpenMP thread-to-thread (hybrid TC) programming model when performing distributed matrix multiplication on SMP PC clusters. With RTA, hybrid TC achieves acceptable computation performance while retaining its dynamic task scheduling capability; as a result, it yields a 22% performance improvement over the pure MPI model on a 16-node cluster of dual-processor Xeon SMPs. Moreover, we provide formulas to predict hybrid TC performance in different circumstances.
In this paper we discuss the application of a hybrid programming paradigm that combines message-pas...
The parallelization process of nested-loop algorithms onto popular multi-level parallel architectur...
While task-based programming, such as OpenMP, is a promising solution to explo...
This paper applies a hybrid MPI-OpenMP programming model with a thread-to-thread communication meth...
The multiplication of large sparse matrices is a basic operation for many scientific and engineering ...
Our study proposes a novel MPI-only parallel programming model with improved performance for SMP clu...
The mixed-mode OpenMP and MPI programming models in parallel application have significant impact on ...
The mixing of shared memory and message passing programming models within a single application has o...
After a brief introduction on Cross Motif Search and its OpenMP and Hybrid OpenMP-MPI implementatio...
Modern computer systems are designed according to multiprocessor configurations. Multiple processors...
The mixing of shared memory and message passing programming models within a single application has o...
Since the last decade, most of the supercomputer architectures are based on cl...
The multiplication of a vector by a matrix is the kernel operation in many algorithms used in scient...
Most HPC systems are clusters of shared memory nodes. Parallel programming must combine the distribu...
We consider the realization of matrix-matrix multiplication and propose a hierarchical alg...