Instruction cache performance is widely recognized as a critical component of overall program performance, especially for large applications such as database servers. In this report, we present a technique for (1) identifying repeated blocks of instructions in a program executable, and (2) converting these repeated code blocks into lightweight procedures (LWprocs). The use of LWprocs reduces the static code size of a program and can potentially reduce the working set size of the process, at the cost of increasing its dynamic instruction count. For most programs, however, the tradeoff appears to favor the reduction in working set size. Even with a simple model of program structure and a straightforwa...
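The two steps named in the abstract above (finding repeated instruction blocks, then replacing them with calls to a shared procedure) can be illustrated with a minimal sketch. This is not the report's implementation: the fixed window length `k`, the textual instruction format, the `call`/`ret` pseudo-instructions, and the function names `find_repeated_blocks` and `abstract_block` are all assumptions made for illustration, and control flow, branch targets, and register constraints are ignored.

```python
from collections import defaultdict


def find_repeated_blocks(instrs, k):
    """Map every k-instruction window that occurs more than once to its start offsets."""
    seen = defaultdict(list)
    for i in range(len(instrs) - k + 1):
        seen[tuple(instrs[i:i + k])].append(i)
    return {block: offsets for block, offsets in seen.items() if len(offsets) > 1}


def abstract_block(instrs, block, name="lw0"):
    """Replace non-overlapping occurrences of `block` with a call (hypothetical syntax).

    Returns the rewritten instruction list and the body of the new procedure.
    """
    k = len(block)
    rewritten, i = [], 0
    while i < len(instrs):
        if tuple(instrs[i:i + k]) == block:
            rewritten.append(f"call {name}")
            i += k
        else:
            rewritten.append(instrs[i])
            i += 1
    body = list(block) + ["ret"]
    return rewritten, body


# Toy program: the same three-instruction block appears three times.
increment = ["ld r1, [a]", "add r1, 1", "st r1, [a]"]
program = increment + ["mul r2, r3"] + increment + ["sub r4, r5"] + increment

repeated = find_repeated_blocks(program, k=3)
block = next(iter(repeated))
new_code, body = abstract_block(program, block)

static_before = len(program)
static_after = len(new_code) + len(body)
# Each call site now executes call + block + ret instead of just the block.
dynamic_before = len(program)
dynamic_after = len(new_code) + repeated[block].__len__() * len(body)
```

In this toy case the static size drops from 11 to 9 instructions while a straight-line run grows from 11 to 17 executed instructions, which is the tradeoff the abstract describes: smaller code at the cost of a higher dynamic instruction count.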
Instruction cache performance is critical to instruction fetch efficiency and overall processor perf...
Estimating worst-case execution times (WCETs) for architectures with caches re...
Current microprocessors incorporate techniques to exploit instruction-level parallelism...
This paper evaluates techniques that attempt to overcome these problems for instruction cache perfor...
Cache performance has become a crucial factor in overall system performance. Ef...
Instruction-cache misses account for up to 40% of execution time in online transaction processing (...
We explore the use of compiler optimizations, which optimize the layout of instructions in memory. T...
Instruction cache performance is very important for the overall performance of a computer. The place...
In this paper we present a straightforward technique for compressing the instruction stream for prog...
Processor speeds continue to improve at a faster rate than memory access times. The issue of...
High instruction fetch bandwidth is essential for high performance in today’s wide-issue out-of-order...
In this paper we propose a technique that uses an additional mini cache located between the I-Cache...
High instruction fetch bandwidth is essential for high performance in today’s wide-issue out-of-orde...
Usual cache optimisation techniques for high performance computing are difficu...
The performance of instruction memory is a critical factor for both large, high performance applicat...