Scaling a parallel program to modern supercomputers is challenging due to inter-process communication, Amdahl's law, and resource contention. Performance analysis tools for finding such scaling bottlenecks are based on either profiling or tracing. Profiling incurs low overhead but does not capture the detailed dependencies needed for root-cause analysis. Tracing collects all of this information, but at prohibitive overhead. In this work, we design SCALANA, which uses static analysis techniques to achieve the best of both worlds: it enables the analyzability of traces at a cost similar to profiling. SCALANA first leverages static compiler techniques to build a Program Structure Graph, which records the main computation and communication patterns as well as the p...
Concurrency levels in large-scale supercomputers are rising exponentially, and shared-memory nodes w...
This paper presents scalability as a basis for profiling and performance debugging of parallel progr...
Concurrency levels in large-scale, distributed-memory supercomputers are rising exponentially. Moder...
There are few runtime tools for modestly sized computing systems, with 10^3 processors, and above th...
Performance analysis tools are an important component of the parallel program development and tuning...
Tracing and performance analysis tools are an important component in the development of high perform...
Applications must scale well to make efficient use of today’s class of petascale computers,...
Developing correct and efficient software for large scale systems is a challenging task. Developers ...
To efficiently exploit the resources of new many-core architectures, integrati...
Applications must scale well to make efficient use of even medium-scale parallel systems. Because sc...
Scalasca is a performance analysis tool that parses the trace of an application run for certain pa...
Event traces are required to correctly diagnose a number of performance problems that arise on today...
It is easy to find errors and inefficient parts of a sequential program by using a standard debugge...
Performance analysis tools are an important component of the parallel program development ...