The advent of exascale requires more scalable and efficient techniques to help developers locate, analyze and correct errors in parallel applications. PARallel COntrol flow Anomaly CHecker (PARCOACH) is a framework that detects the origin of collective errors in applications using MPI and/or OpenMP. In MPI, such errors include collective operation mismatches. In OpenMP, a collective error can be a barrier not called by all tasks in a team. In this paper, we present an extension of PARCOACH which improves its collective error detection. We show that our analysis is more precise and accurate than the previous one on different benchmarks and real applications.
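To make the notion of a collective operation mismatch concrete, the following is a minimal Python sketch that compares, step by step, the sequence of collectives issued by each rank. It is only an illustration of the error class and not PARCOACH's analysis, which the abstract does not detail; the function name `first_collective_mismatch` and the trace lists are hypothetical, and no MPI library is involved.

```python
from typing import Optional, Sequence


def first_collective_mismatch(traces: Sequence[Sequence[str]]) -> Optional[int]:
    """Return the index of the first step where ranks disagree, or None.

    traces[r] is the ordered list of collective names issued by rank r
    (a hypothetical, simplified view of one execution).
    A rank that runs out of calls while others still have some also counts
    as a mismatch, since the remaining ranks would wait indefinitely.
    """
    longest = max((len(t) for t in traces), default=0)
    for step in range(longest):
        # Collect what every rank calls at this step (None = no call left).
        calls = {t[step] if step < len(t) else None for t in traces}
        if len(calls) > 1:
            return step
    return None


# Assumed example traces: all ranks agree in `ok`; in `bad`, rank 0 takes a
# branch that calls MPI_Bcast while ranks 1 and 2 call MPI_Reduce.
ok = [["MPI_Barrier", "MPI_Allreduce"]] * 3
bad = [
    ["MPI_Barrier", "MPI_Bcast"],
    ["MPI_Barrier", "MPI_Reduce"],
    ["MPI_Barrier", "MPI_Reduce"],
]
```

Calling `first_collective_mismatch(bad)` reports the step at which the ranks diverge, which is the kind of location information a developer needs to trace the error back to a control-flow divergence in the source.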
Algorithms are presented for detecting errors and anomalies in programs which use synchronization co...
MPI is the de-facto standard message-passing based parallel programming model. However, the bug dete...
Identifying the anomalies is a critical task to maintain the uptime of the monitored distr...
The advent of exascale requires more scalable and efficient techniques to help...
Determining if a parallel program behaves as expected on any execution is chal...
Supercomputers are rapidly evolving with now millions of processing units, pos...
Scientific applications mainly rely on the MPI parallel programming model to r...
Nowadays most scientific applications are parallelized based on MPI communicat...
The Message Passing Interface (MPI) is a parallel programming model used to ex...
MPI is the most widely used parallel programming model. But the reducing amoun...
MPI-3 provides functions for non-blocking collectives. To help programmers intr...
An MPI profiling library is a standard mechanism for intercepting MPI calls by applications. Profili...
An MPI profiling library is a standard mechanism for intercepting MPI calls by applications. Profil...
Increasing computational demand of simulations motivates the use of parallel computing systems. At t...