Run-time profiling of executable binaries can offer valuable insight into the performance characteristics and behaviour of a program. Some methods, such as instrumentation, are invasive and involve modifications of the profiled binary. This can significantly impact performance, to the point that an instrumented binary runs many times slower than the original. The Performance Monitoring Unit found in many modern processors offers the possibility of low-overhead profiling through a plethora of performance events. In this report, we investigate and quantify this overhead for a variety of tests and configurations, using the “perf” tool of the Linux kernel. Results for four main usage modes of the PMU are included: counting, sampling, PEBS event...
A fundamental part of developing software is to understand what the application spends time on. This...
Memory contention is one of the largest sources of inter-core interference in statically partitioned...
Performance profiling generates measurement overhead during parallel program execution. Me...
For industrial systems performance, it is desired to keep the IT infrastructure competitive through ...
Modern processors incorporate several performance monitoring units, which can be used to count event...
CPU clock frequency is not likely to be increased significantly in the coming years, and data analys...
We present accurate, low-level measurements of process preemption, interrupt handling and memory sys...
Performance analysis is an essential step for better software optimization, which is critical for em...
Performance profiling of MPI programs generates overhead during execution that introduces ...
We introduce the usage of hardware performance counters (HPCs) as a new method that allows very prec...
A program profile attributes run-time costs to portions of a program's execution. Most profiling sys...
We present accurate, low-level measurements of process preemption, interrupt handling and memory sys...
This paper presents an automatic counter instrumentation and profiling module added to the MPI librar...
Hardware performance monitoring counters (PMCs) have proven effective in characterizing application ...
Today, modern processors are equipped with a special unit named PMU that enables software developers...