Many experimental performance evaluations depend on accurate measurements of the cost of executing a piece of code. Often these measurements are conducted using infrastructures to access hardware performance counters. Most modern processors provide such counters to count micro-architectural events such as retired instructions or clock cycles. These counters can be difficult to configure, may not be programmable or readable from user-level code, and cannot discriminate between events caused by different software threads. Various software infrastructures address this problem, providing access to per-thread counters from application code. This paper constitutes the first comparative study of the accuracy of three commonly used measurement in...
Modern processors incorporate several performance monitoring units, which can be used to count event...
Cycles-Per-Instruction (CPI) stacks provide intuitive and insightful performance information to soft...
We introduce the usage of hardware performance counters (HPCs) as a new method that allows very prec...
Performance analysis is an essential step for better software optimization, which is critical for em...
A common way of representing processor performance is to use Cycles per Instruction (CPI) `stacks' w...
Cycles per Instruction (CPI) stacks break down processor execution time into a baseline CPI plus a n...
High performance computing is playing an increasingly important role in the scientific community. As...
We present accurate, low-level measurements of process preemption, interrupt handling and memory sys...
Hardware performance monitoring counters have recently received a lot of atten...
CPU clock frequency is not likely to be increased significantly in the coming years, and data analys...
When creating architectural tools, it is essential to know whether the generated results make sense....
The purpose of the PAPI project is to specify a standard application programming interface (API) for...
Performance observability is the ability to accurately capture, analyze, and present (collectively o...
Memory contention is one of the largest sources of inter-core interference in statically partitioned...