Performance monitoring of HPC applications offers opportunities for adaptive optimization based on dynamic performance behavior, unavailable in purely post-mortem performance views. However, a parallel performance monitoring system must have low overhead and high efficiency to make these opportunities tangible. We describe a scalable parallel performance monitor called TAUoverMRNet (ToM), created from the integration of the TAU performance system and the Multicast Reduction Network (MRNet). The integration is achieved through a plug-in architecture in TAU that allows selection of different transport substrates to offload online performance data. A method to establish the transport overlay structure of the monitor from within TAU, one that r...
Concurrency levels in large-scale, distributed-memory supercomputers are rising exponentially. Moder...
Over the past 10 years we have seen the transition from single-core computing to multicore computing,...
This tutorial presents state-of-the-art performance tools for leading-edge HPC systems founded on th...
Online application performance monitoring allows tracking performance characteristics duri...
This paper introduces an infrastructure for efficiently collecting performance profiles from paralle...
Modern parallel systems and applications are constantly increasing in scale and complexity, and cons...
Traditional performance analysis techniques are performed after a parallel program has comp...
Current large-scale HPC systems consist of complex configurations with a huge number of potentially ...
Parallel architectures, like the transputer-based multicomputer network, offer potentially enormous...
The purpose of this project was to build an extensible cross-platform infrastructure to facilitate t...
The growth of High Performance Computer (HPC) systems increases the complexity with respect to under...
High-performance computing systems have become increasingly dynamic, complex, and unpredictable. To ...
There is a variety of tools to measure the performance of Linux systems and the applications running...
The HPC service at CERN provides Linux batch infrastructure to run high-performance computing appli...
Large-scale computer clusters have in recent years become dominant for making computations in ...