A processor executes the full dynamic instruction stream in order to compute the final output of a program, yet we observe equivalent, smaller instruction streams that produce the same correct output. Based on this observation, we attempt to identify large, dynamically-contiguous regions of instructions that are ineffectual as a whole: they either contain no writes, writes that are never referenced, or writes that do not modify the value of a location. The architectural implication is that instruction fetch/execution can quickly bypass predicted-ineffectual regions, while another thread of control verifies that the implied branch predictions in the region are correct and that the region is truly ineffectual.
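The three conditions for an ineffectual region can be illustrated on a toy trace. The following is a minimal sketch only, not the mechanism proposed in the abstract: the `Access` record, the function names `ineffectual_writes` and `region_is_ineffectual`, and the trace format are all hypothetical, introduced here for illustration. It makes two conservative assumptions: a write to a location whose prior value is unknown is never treated as value-preserving, and a write that is still unread at the end of the trace is treated as effectual, because it may be part of the program's output.

```python
from typing import NamedTuple, Optional


class Access(NamedTuple):
    """One dynamic access in a toy trace (hypothetical format)."""
    kind: str       # "R" for a read, "W" for a write
    loc: str        # register or memory location name
    value: int = 0  # value written; ignored for reads


def ineffectual_writes(trace: list, initial: Optional[dict] = None) -> set:
    """Return the trace indices of writes that are ineffectual.

    A write is ineffectual here if it stores the value the location
    already holds, or if it is overwritten before any read references it.
    """
    state = dict(initial or {})  # known current value of each location
    pending = {}                 # loc -> index of latest write not yet read
    dead = set()
    for i, acc in enumerate(trace):
        if acc.kind == "R":
            # The latest write to this location is now referenced.
            pending.pop(acc.loc, None)
            continue
        if acc.loc in state and state[acc.loc] == acc.value:
            # Write does not modify the value of the location.
            dead.add(i)
            continue
        if acc.loc in pending:
            # Previous write is overwritten without ever being read.
            dead.add(pending[acc.loc])
        pending[acc.loc] = i
        state[acc.loc] = acc.value
    # Writes still pending at the end are conservatively kept as effectual.
    return dead


def region_is_ineffectual(trace: list, start: int, end: int, dead: set) -> bool:
    """True if every write in trace[start:end] is ineffectual.

    A region with no writes at all satisfies this trivially.
    """
    return all(i in dead
               for i in range(start, end)
               if trace[i].kind == "W")


trace = [
    Access("W", "x", 1),  # 0: overwritten at index 2 before any read
    Access("R", "y"),     # 1
    Access("W", "x", 2),  # 2: read at index 3, so effectual
    Access("R", "x"),     # 3
    Access("W", "x", 2),  # 4: stores the value x already holds
    Access("W", "z", 5),  # 5: unread at end of trace, kept as effectual
]
dead = ineffectual_writes(trace)
print(sorted(dead))
print(region_is_ineffectual(trace, 0, 2, dead))
print(region_is_ineffectual(trace, 2, 4, dead))
```

This sketch classifies writes with full knowledge of the trace. The abstract instead describes predicting that a region is ineffectual and verifying the prediction with a separate thread of control, which this example does not model.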
Though current general-purpose processors have several small CPU cores as opposed to a single more c...
Instructions executed by the processor are dynamically dead if the values they produce are not used ...
In a dynamic reordering superscalar processor, the front-end fetches instructions and places them in...
At present there exist three main schools of thought for improving single-threaded program performan...
Fetch engine performance is seriously limited by the branch prediction table access latency. This fa...
Superscalar microprocessors currently power the majority of computing machines. These processors ar...
Current processors exploit out-of-order execution and branch prediction to improve instruction level...
For many applications, branch mispredictions and cache misses limit a processor’s performance to a l...
Our goal is to dramatically increase the performance of uniprocessors through the exploitation of in...
We observe a non-negligible fraction (3 to 16% in our benchmarks) of dynamically dead instruction...
In the modern era of wire-dominated architectures, specific effort must be made to reduce needless c...
Instruction traces are useful tools for studying many aspects of computer systems, but they are diff...
The fact that instructions in programs often produce repetitive results has motivated researchers to...
To maintain a reasonable level of complexity, processor implementations contain Serializing Instruct...
Value prediction attempts to eliminate true-data dependencies by dynamically predicting the outcome ...