Instruction reuse is a microarchitectural technique that exploits dynamic instruction repetition to remove redundant computations at run time. In this paper we examine instruction reuse of integer ALU and load instructions in network processing applications and attempt to answer the following questions: (1) How much instruction repetition can be reused in packet processing applications? (2) Can the temporal locality of network traffic be exploited to reduce interference in the Reuse Buffer and improve reuse? (3) What is the effect of reuse on microarchitectural features such as resource contention and memory accesses? We use an execution-driven simulation methodology to evaluate instruction reuse and find that for the benchmarks cons...
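The idea of a Reuse Buffer, and of interference within it, can be illustrated with a minimal software sketch. The abstract does not describe the buffer's organization, so everything below is an assumption made for illustration: a direct-mapped table indexed by instruction address, entries that store operand values and the result, and the hypothetical names `ReuseBuffer`, `ReuseEntry`, and `execute_add`. It is not the simulator or the buffer design evaluated in the paper.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ReuseEntry:
    pc: int                     # address of the instruction that filled the entry
    operands: Tuple[int, ...]   # operand values seen at that execution
    result: int                 # result produced by that execution


class ReuseBuffer:
    """Direct-mapped reuse buffer (assumed organization, illustration only)."""

    def __init__(self, num_entries: int = 64) -> None:
        self.num_entries = num_entries
        self.entries: List[Optional[ReuseEntry]] = [None] * num_entries
        self.lookups = 0
        self.hits = 0

    def _index(self, pc: int) -> int:
        # Assumed indexing: instruction address modulo table size.
        return pc % self.num_entries

    def lookup(self, pc: int, operands: Tuple[int, ...]) -> Optional[int]:
        """Return the stored result if this pc was seen with the same operands."""
        self.lookups += 1
        entry = self.entries[self._index(pc)]
        if entry is not None and entry.pc == pc and entry.operands == operands:
            self.hits += 1
            return entry.result
        return None

    def insert(self, pc: int, operands: Tuple[int, ...], result: int) -> None:
        # Overwrites whatever occupied the slot; this is the source of interference.
        self.entries[self._index(pc)] = ReuseEntry(pc, operands, result)


def execute_add(rb: ReuseBuffer, pc: int, a: int, b: int) -> Tuple[int, bool]:
    """Model a 32-bit integer add; returns (result, was_reused)."""
    operands = (a, b)
    reused = rb.lookup(pc, operands)
    if reused is not None:
        return reused, True          # redundant computation skipped
    result = (a + b) & 0xFFFFFFFF    # compute normally on a miss
    rb.insert(pc, operands, result)
    return result, False
```

In this sketch, executing the same add at the same address with the same operands twice reuses the stored result the second time. Two instructions whose addresses map to the same slot evict each other, which is the kind of interference the second question in the abstract refers to.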
Instruction scheduling and software pipelining are important compilation techniques which reorder in...
Pipelined microprocessors allow the simultaneous execution of several machine instructions at a time...
Modern network processors such as the Intel IXP family hide the latency of slow instructions by supp...
Instruction reuse is a microarchitectural technique that improves the execution time of a program by...
Trace-level reuse is based on the observation that some traces (dynamic sequences of instructions) a...
Superscalar microprocessors currently power the majority of computing machines. These processors ar...
Processors that can simultaneously execute multiple paths of execution will only exacerbate the fetc...
This paper presents a study of the performance limits of data value reuse. Two types of data value r...
The fact that instructions in programs often produce repetitive results has motivated researchers to...
The fact that instructions in programs often produce repetitive results has motivated researchers to...
Value locality is the phenomenon that a small number of values occur repeatedly in the same register...
Modern network processors support high levels of parallelism in packet processing by supporting mult...
Instruction-level redundancy is an effective scheme to reduce the susceptibility of microp...
As technology trends yield shorter cycle times and larger, wider datapaths in architectures for mult...
To maximize the performance of wide-issue superscalar out-of-order microprocessors, the issue stage ...