This paper addresses the performance prediction of hybrid MPI/OpenMP code. HeSSE (Heterogeneous System Simulation Environment), used together with MetaPL, an XML-based prototype language, makes it possible to predict hybrid application performance under many different working conditions, e.g., before the code is fully developed or on a system that is not available. After a review of hybrid programming techniques and a brief overview of the HeSSE simulation environment, the paper discusses the problems related to simulating hybrid code and describing it through trace files. The whole application modeling and analysis cycle is presented and validated by predicting the performance of a parallel N-body code on an SMP cluster and comparing it to the t...
Finely tuning MPI applications (number of processes, granularity, collective op...
This paper is mainly a summary of two years of my research. I will start from the basic theory of th...
Overview: Most HPC systems are clusters of shared memory nodes. To use such systems efficiently both...
This paper deals with the performance prediction of hybrid OpenMP/MPI code. After a brief overview o...
The mixing of shared memory and message passing programming models within a single application has o...
The mixed-mode OpenMP and MPI programming models in parallel applications have a significant impact on ...
Clusters of symmetric multiprocessors (SMPs) are currently the most widely used architecture for large scal...
Several performance analysis tools support hybrid applications. Most originated as MPI profiling or ...
The hybrid method of parallelization (using MPI for inter-node communication and OpenMP fo...
This paper describes a simulation-based technique for the performance prediction of message-passing ...
The EXPERT performance-analysis environment provides a complete tracing-based solution for automatic...
Hybrid parallelization may be the only path for most codes to use HPC systems on a very large scale....
Hybrid programming, whereby shared memory and message passing programming techniques are combined wi...
The paper describes some very early experiments on new architectures that support the hyb...