Memory operations remain a significant bottleneck in dynamically scheduled pipelined processors, due in part to the inability to statically determine the existence of memory address dependencies. Hardware memory renaming techniques have been proposed that predict which stores a load might depend on. These predictions can be used to speculatively forward a value from a predicted store dependency to a load through a value prediction table; however, these techniques require large and time-consuming hardware tables. In this paper we propose a software-guided approach for identifying dependencies between store and load instructions, and the Load Marking (LM) architecture to communicate these dependencies to the hardware. Compil...
Memory dependence prediction allows out-of-order issue processors to achieve high degrees of instruc...
Dependencies between instructions restrict instruction-level parallelism and make it difficult for...
Register promotion is an optimization that allocates a value to a register for a region of its lifet...
As processors continue to exploit more instruction level parallelism, a greater demand is placed on ...
An increasing cache latency in next-generation processors incurs profound performance impa...
As processors continue to exploit more instruction level parallelism, greater demands are placed on ...
Keywords: memory disambiguation, load-forwarding, speculation. The superscalar processor must issue instruction...
Data speculation refers to the execution of an instruction before some logically preceding instruc...
In high-end processors, increasing the number of in-flight instructions can improve performance by o...
Memory latency is an important bottleneck in system performance that cannot be adequately solved by ...
Two orthogonal hardware techniques, table-based address prediction and early address calculation, fo...
Processor performance is directly impacted by the latency of the memory system. As processor core cy...
As the existing techniques that empower the modern high-performance processors are being refined and...
Data communications between producer instructions and consumer instructions through memory incur ext...
Research on computer memory systems has been of increasing importance over the last decade, as they ...