To better understand the formation of wait states in MPI programs and to help users identify optimization targets in the presence of load imbalance, a major source of wait states, we added in earlier work two new trace-analysis techniques to Scalasca, a performance-analysis tool designed for large-scale applications. In this paper, we show how these two techniques, which were originally restricted to two-sided and collective MPI communication, are extended to also cover one-sided communication. We report our experiences with benchmark programs and with a mini-application representing the core of the POP ocean model.