Store-queue-free architectures remove the store queue and instead use memory cloaking to communicate in-flight stores. In these architectures, mispredictions can be frequent when store-to-load dependences are inconsistent. We present DMDP (Dynamic Memory Dependence Predication), which modifies the microarchitectural behavior for such loads to mitigate memory dependence mispredictions. When a given dependence is hard to predict, i.e., a load sometimes depends on a particular store and is independent at other times, DMDP predicates the load: the address of the load is compared with the address of the predicted store to compute a predicate. This predicate guides the load to obtain the value from either the cache or th...
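The predication step described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware design: the names `InFlightStore` and `predicated_load` are hypothetical, the cache is modeled as a plain dictionary, and the alternative value source is assumed to be the predicted store's data, which the truncated text does not confirm. Access sizes and partial overlaps are ignored.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class InFlightStore:
    """Hypothetical record for the store a predictor paired with a load."""
    address: int
    data: int


def predicated_load(
    load_address: int,
    predicted_store: Optional[InFlightStore],
    cache: Dict[int, int],
) -> Tuple[int, str]:
    """Return (value, source) for a load predicated on a predicted store.

    The predicate is an address comparison between the load and the
    predicted store. Assumption: when it holds, the load takes the
    store's data; otherwise it reads the (dictionary-modeled) cache.
    """
    predicate = (
        predicted_store is not None
        and predicted_store.address == load_address
    )
    if predicate:
        return predicted_store.data, "store"
    # Unmapped addresses read as 0 in this toy model.
    return cache.get(load_address, 0), "cache"


cache = {0x100: 7, 0x104: 9}
store = InFlightStore(address=0x100, data=42)

print(predicated_load(0x100, store, cache))  # addresses match
print(predicated_load(0x104, store, cache))  # addresses differ
```

The point of the sketch is that both outcomes are handled without a squash: the load proceeds either way, and the address comparison only selects which value it consumes.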
Historically, energy-constrained devices (ECDs) have favored simple in-order pipelines over out-of-o...
The ever-increasing computational power of contemporary microprocessors reduces the execution time s...
Memory Dependency Prediction (MDP) is paramount to good out-of-order performan...
We consider a variety of dynamic, hardware-based methods for exploiting load/store parallelism, incl...
As the existing techniques that empower the modern high-performance processors are being refined and...
An efficient mechanism to track and enforce memory dependences is crucial to an out-of-order micropr...
With the help of the memory dependence predictor, the instruction scheduler can speculatively issue ...
In high-end processors, increasing the number of in-flight instructions can improve performance by o...
Modern out-of-order processor architectures focus significantly on the high-performance execution of...
Future multi-core and many-core processors are likely to contain one or more high-performance out-of...
Data cache misses reduce the performance of wide-issue processors by stalling the data supply to the...
Various memory consistency model implementations (e.g., x86, SPARC) willfully allow a core to see it...
Memory dependence prediction allows out-of-order issue processors to achieve high degrees of instruc...