Identifying performance bottlenecks and their associated calling contexts is critical for tuning high-performance applications. This thesis presents a new approach to measuring resource utilization and its calling context. Previous instrumentation-based approaches for reporting calling context introduce overhead proportional to the number of function calls performed. We describe a new design for a call path profiler based on stack sampling. Our design enables profiling of unmodified binaries, provides low and controllable overhead, and accurately attributes context-dependent costs of calls. We use a special trampoline function that improves the efficiency of stack sampling and enables the association of unique invocation counts with sampled...
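As a rough illustration of the sampling idea in the abstract above, the following minimal sketch folds sampled call stacks into a calling context tree, so that the same function reached through different call paths is charged separately. It is not the thesis's implementation: the names `CCTNode`, `record_sample`, and `inclusive` and the sample data are hypothetical, stack unwinding is replaced by hard-coded lists, and the trampoline mechanism is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class CCTNode:
    """One calling context: a function reached through a specific call path."""
    name: str
    samples: int = 0  # samples whose innermost frame is this context
    children: dict = field(default_factory=dict)

    def child(self, name):
        # Reuse an existing child so identical call paths share one node.
        if name not in self.children:
            self.children[name] = CCTNode(name)
        return self.children[name]


def record_sample(root, stack):
    """Attribute one sample to the context named by `stack` (outermost frame first)."""
    node = root
    for frame in stack:
        node = node.child(frame)
    node.samples += 1


def inclusive(node):
    """Samples taken in this context plus everything called from it."""
    return node.samples + sum(inclusive(c) for c in node.children.values())


# Hypothetical samples; a real profiler would unwind the stack when a timer
# or hardware counter fires, rather than read them from a list.
samples = [
    ["main", "solve", "dot"],
    ["main", "solve", "dot"],
    ["main", "report", "dot"],
    ["main", "solve"],
]

root = CCTNode("<root>")
for stack in samples:
    record_sample(root, stack)

main = root.children["main"]
solve_dot = main.children["solve"].children["dot"].samples
report_dot = main.children["report"].children["dot"].samples
```

In this sketch the work done per run grows with the number of samples taken rather than with the number of function calls executed, which is the property that makes sampling overhead controllable. The two `dot` contexts stay distinct, so cost incurred under `solve` is not merged with cost incurred under `report`.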
Existing methods for call graph profiling, such as that used by gprof, deal badly with programs ...
Calling context trees are one of the most fundamental data structures for representing the interproc...
The majority of existing application profiling techniques aggregate and report performance costs b...
Call graph profiling reports measurements of resource utilization along with information about the c...
Applications must scale well to make efficient use of even medium-scale parallel systems. Because sc...
Calling context profiling collects statistics separately for each calling context. Complete calling ...
Runtime call graph profilers, like gprof [16], are widely used as debugging tools to identify perfor...
Applications must scale well to make efficient use of today’s class of petascale computers,...
Calling context trees (CCTs) associate performance metrics with paths through a program's call graph...
The correlation of performance bottlenecks and their associated source code has become a cornerstone...
Calling context profiling fulfills programmers’ information needs to obtain a complete picture of a ...