Dynamic instruction mixes are an important part of the toolkit of performance tuners, compiler writers, and CPU architects. They are traditionally generated using software instrumentation, an accurate yet slow method that is normally limited to user-mode code. We present a new method for generating instruction mixes using the Performance Monitoring Unit (PMU) of the CPU. It has very low overhead, extends coverage to kernel-mode execution, and causes only a very modest decrease in accuracy compared to software instrumentation. To achieve this level of accuracy, we develop a new PMU-based data collection method, Hybrid Basic Block Profiling (HBBP). HBBP uses simple machine learning techniques to choose, on a per basi...
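As a rough illustration of what a dynamic instruction mix is, the sketch below weights each basic block's static opcode list by an execution count for that block. It is not the HBBP method itself: the block names, opcode lists, and counts are hypothetical, and the functions `instruction_mix` and `as_fractions` are names introduced here for illustration only. How the per-block counts are obtained (instrumentation or PMU data) is outside the scope of the sketch.

```python
from collections import Counter


def instruction_mix(block_instructions, block_counts):
    """Weight each block's static opcode list by its execution count.

    block_instructions: maps a block id to the opcodes it contains.
    block_counts: maps a block id to how many times it executed.
    Both inputs are assumed to be supplied by some profiler.
    """
    mix = Counter()
    for block, count in block_counts.items():
        # Blocks with no known instruction list contribute nothing.
        for opcode in block_instructions.get(block, ()):
            mix[opcode] += count
    return mix


def as_fractions(mix):
    """Convert absolute opcode counts into shares of all executed instructions."""
    total = sum(mix.values())
    return {op: n / total for op, n in mix.items()} if total else {}


# Hypothetical input data, for illustration only.
blocks = {
    "bb0": ["mov", "add", "cmp", "jne"],
    "bb1": ["mov", "mul", "jmp"],
}
counts = {"bb0": 1000, "bb1": 250}

mix = instruction_mix(blocks, counts)
shares = as_fractions(mix)
```

With this formulation, the accuracy of the mix depends entirely on the accuracy of the per-block execution counts, which is why the choice of data source for those counts matters.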
The increasing density of VLSI circuits has motivated research into ways to utilize large area budge...
Current microprocessors improve performance by exploiting instruction-level parallelism (ILP). ILP h...
Debugging and profiling tools can alter the execution flow or timing, can induce heisenbug...
A fundamental part of developing software is to understand what the application spends time on. This...
All high-performance production JVMs employ an adaptive strategy for program execution. Methods are ...
Basic block vectorization consists in extracting instruction level parallelism inside basic ...
Modern hardware features can boost the performance of an application, but soft...
Profile-based optimizations can be used for instruction scheduling, loop scheduling, data p...
PhD Thesis. Current microprocessors improve performance by exploiting instruction-level parallelism (I...
perform statistical sampling by taking periodic snapshots of a program’s state. Statistical samplin...
Tuning a compiler so that it produces optimised code is a difficult task because modern processors ...
The end of chip frequency scaling capacity, due to heat dissipation limitations, made manufacturers sea...
The development of modern pipelined and multiple functional unit processors has increased the availa...
Profile-based optimizations can be used for instruction scheduling, loop scheduling, data preloading...
Run-time profiling of executable binaries can offer valuable insight into the performance characteri...