Characterizing the communication behavior of large-scale applications is a difficult and costly task due to code and system complexity as well as their long execution times. An alternative to running actual codes is to gather their communication traces and then replay them, which facilitates application tuning and future procurements. While past approaches lacked lossless scalable trace collection, we contribute an approach that provides near constant-size communication traces regardless of the number of nodes while preserving structural information. We introduce intra- and inter-node compression techniques of MPI events and present results of our implementation for BlueGene/L. Given this novel capability, we discuss its impact on communicat...
Scalability to a large number of processes is one of the weaknesses of current MPI implementations. St...
This paper presents an optimization of MPI communications, called CoMPI, based on run-time compressi...
A powerful and widely-used method for analyzing the performance behavior of parallel programs is eve...
Characterizing the communication behavior of large-scale applications is a difficult and c...
Characterizing the communication behavior of large-scale applications is a difficult and costly task...
This paper presents a portable optimization for MPI communications, called PRAcTICaL-MPI (Portable A...
Event tracing of applications under dynamic execution is crucial for performance modeling, optimizat...
The performance of massively parallel programs is often impacted by the cost of communication across ...
This paper presents an optimization of MPI communication, called Adaptive-CoMPI, based on runtime co...
Applications must scale well to make efficient use of even medium-scale parallel systems. Because sc...
The thesis presents a contribution to the analysis and visualization of computational performance ba...
This thesis offers a novel framework for representing groups and communicators in Message Passing In...
Performance analysis of scientific parallel applications is essential to use High Performanc...
The process of obtaining useful message passing applications tracefiles for performance analysis in ...
Performance analysis is an essential part of the development process of HPC applications. Thus, deve...