This paper presents an automatic counter instrumentation and profiling module added to the MPI library on Cray T3E and SGI Origin2000 systems. A detailed summary of the hardware performance counters and the MPI calls of any MPI production program is gathered during execution and written in MPI_Finalize to a special syslog file. The user can get the same information in a different file. Statistical summaries are computed weekly and monthly. The paper describes experiences with this library on the Cray T3E systems at HLRS Stuttgart and TU Dresden. It focuses on the problems of integrating the hardware performance counters into MPI counter profiling and presents first results with these counters. Also, a second software design is described that al...
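As a rough illustration of the counter-profiling idea in this abstract (per-call counters accumulated during execution, with a summary written once at finalize time), the following Python sketch wraps stand-in communication routines. It is not the library's implementation: the real mechanism would presumably sit at the C level of the MPI library, and the names `CallProfile`, `send`, `barrier`, and `finalize` are hypothetical, as is the assumption that a bytes-like first argument is the message payload. Hardware performance counters are not modelled here.

```python
import functools
import time
from collections import defaultdict


class CallProfile:
    """Accumulates per-routine counters (calls, time, bytes)."""

    def __init__(self):
        # routine name -> [number of calls, total seconds, total bytes]
        self.stats = defaultdict(lambda: [0, 0.0, 0])

    def instrument(self, func):
        """Wrap func so that every call updates the counters for its name."""
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                entry = self.stats[func.__name__]
                entry[0] += 1
                entry[1] += time.perf_counter() - start
                # Assumption: a bytes-like first argument is the payload.
                if args and isinstance(args[0], (bytes, bytearray)):
                    entry[2] += len(args[0])
        return wrapper

    def write_summary(self, out):
        """Write one line per instrumented routine, sorted by name."""
        for name in sorted(self.stats):
            calls, seconds, nbytes = self.stats[name]
            out.write(f"{name} calls={calls} bytes={nbytes} time={seconds:.6f}\n")


profile = CallProfile()


@profile.instrument
def send(buf, dest):
    """Hypothetical stand-in for a send routine; no real communication."""
    return len(buf)


@profile.instrument
def barrier():
    """Hypothetical stand-in for a barrier."""
    return None


def finalize(out):
    """Stand-in for the finalize step, where the summary is written once."""
    profile.write_summary(out)
```

In this sketch the application code calls `send` and `barrier` unchanged, and `finalize(sys.stdout)` (or any writable file object standing in for the syslog file) emits the accumulated summary.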
TOPAS is a tool to automatically and transparently monitor usage and performance of every parallel j...
In this paper, we analyze existing MPI benchmarking suites, focusing on two restrictions t...
T3E-900, the Cray Origin 2000 and the IBM P2SC on a collection of 13 communication tests. These test...
Performance profiling of MPI programs generates overhead during execution that introduces ...
We have developed a new MPI benchmark package called MPIBench that uses a very precise and portable ...
The desire for high performance on scalable parallel systems is increasing the complexity and the...
Event tracing of parallel programs can provide valuable information about program performance. The d...
For industrial systems performance, it is desired to keep the IT infrastructure competitive through ...
An MPI profiling library is a standard mechanism for intercepting MPI calls by applications. Profili...
The purpose of the PAPI project is to specify a standard application programming interface (API) for...
The aim of the project is to develop a light-weight MPI profiling library that differentiates betwee...
Hardware performance counters are CPU registers that count data loads and stores, cache misses, and ...
The EXPERT performance-analysis environment provides a complete tracing-based solution for automatic...
In this work, a standard and unified method for monitoring hardware accelerators in Reconfigurable C...