Due to the available concurrency in modern-day supercomputers, the complexity of developing efficient parallel applications for these platforms has grown rapidly in recent years. Many applications use message passing for parallelization, which offers three main communication paradigms: point-to-point, collective, and one-sided communication. Each paradigm best fits certain classes of algorithms and communication patterns. The one-sided paradigm decouples communication from synchronization and allows a single process to define a complete communication. These are important features for runtime systems of new programming paradigms and state-of-the-art dynamic load-balancing strategies. In any process interaction, wait states can occur, where a pro...
One-sided communication in MPI requires the use of one of three different synchronization mechanism...
In the world of message-passing distributed computing, reliable synchronous systems and asynchronou...
We present two tests for analyzing deadlock for a class of communicating sequential processes. The...
Driven by growing application requirements and accelerated by current trends in microprocessor desig...
To better understand the formation of wait states in MPI programs and to support the user in finding...
Performance analysis is an essential part of the development process of HPC applications. Thus, deve...
The amount of parallelism in modern supercomputers currently grows from generation to generation, an...
The amount of parallelism in modern supercomputers currently grows from generation to generation. Fu...
In recent years, one-sided communication has emerged as an alternative to message-based communicatio...
Through analysis and experiments, this paper investigates two-phase waiting algorithms to minimize t...
Barrier primitives provided by standard parallel programming APIs are the primary means by which app...
We propose a new class of profiler for distributed and heterogeneous systems. In these sys...
Load imbalance usually introduces wait states into the execution of parallel programs. Being able to...
In this thesis, we studied the behavior of parallel programs to understand how to automate the task...
Hiding communication latency is an important optimization for parallel programs. Programmers or com...