An efficient mechanism to track and enforce memory dependences is crucial to an out-of-order microprocessor. The conventional approach of using a cross-checked load queue and store queue, while very effective in earlier processor incarnations, suffers from scalability problems in modern high-frequency designs that rely on buffering many in-flight instructions to exploit instruction-level parallelism. In this paper, we make a case for a very different approach to dynamic memory disambiguation. We move away from the conventional exact disambiguation strategy and adopt an opportunistic method: we allow loads and stores to access an L0 cache as they are issued out of program order, hoping that with such a laissez-faire approach, most loads actual...
Store misses cause significant delays in shared-memory multiprocessors because of limited store buff...
Modern out-of-order processors tolerate long latency memory operations by supporting a large number ...
The memory consistency model of a shared-memory multiprocessor determines the extent to which memory...
With the help of the memory dependence predictor the instruction scheduler can speculatively issue l...
One of the main challenges of modern processor designs is the implementation of scalable and efficie...
In high-end processors, increasing the number of in-flight instructions can improve performance by o...
Modern out-of-order processor architectures focus significantly on the high performance execution of...
Thesis (Ph. D.)--University of Rochester. Dept. of Electrical Engineering, 2008. Continued scaling of...
One of the main challenges of modern processor designs is the implementation of scalable and efficie...
Store-queue-free architectures remove the store queue and use memory cloaking to communicate in-flig...
Because they are based on large content-addressable memories, load-store queues (LSQs) present imple...
Various memory consistency model implementations (e.g., x86, SPARC) willfully allow a core to see it...
Concurrent programs running on weak memory models exhibit relaxed behaviours,...
Writing concurrent programs with shared memory is often not trivial. Correctly synchronising the thr...
In this paper, we develop the first feasibly implementable scheme for end-to-end dynamic verificatio...