Call graph profiling reports measurements of resource utilization together with the calling context in which the resources were consumed. We present the design of a novel profiler that measures resource utilization and its associated calling context using a stack sampling technique. Our scheme offers a distinctive combination of features and mechanisms. First, it requires no compiler support or instrumentation of either source or binary code. Second, it works on heavily optimized code and on complex, multi-module applications. Third, it uses sampling rather than tracing to build a context tree, collect histogram data, and characterize calling patterns. Fourth, the data structures and algorithms are efficient enough to construct th...