In high-end processors, increasing the number of in-flight instructions can improve performance by overlapping useful processing with long-latency accesses to the main memory. Buffering these instructions requires a tremendous amount of microarchitectural resources. Unfortunately, large structures negatively impact processor clock speed and energy efficiency. Thus, innovations in effective and efficient utilization of these resources are needed. In this paper, we target the load-store queue, a dynamic memory disambiguation logic that is among the least scalable structures in a modern microprocessor. We propose to use software assistance to identify load instructions that are guaranteed not to overlap with earlier pending stores and prevent ...
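As a rough illustration of the disambiguation check that a load-store queue performs for each load, the sketch below searches the older pending stores for a byte-range overlap. It is a software model under stated assumptions, not the hardware design and not the mechanism proposed in the abstract above; the names `PendingStore`, `overlaps`, and `youngest_conflicting_store` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PendingStore:
    """Hypothetical record for a store that has not yet committed."""
    seq: int   # program-order sequence number (smaller = older)
    addr: int  # first byte address written
    size: int  # access width in bytes


def overlaps(a_addr: int, a_size: int, b_addr: int, b_size: int) -> bool:
    """True if the byte ranges [a_addr, a_addr+a_size) and [b_addr, b_addr+b_size) intersect."""
    return a_addr < b_addr + b_size and b_addr < a_addr + a_size


def youngest_conflicting_store(
    stores: List[PendingStore], load_seq: int, load_addr: int, load_size: int
) -> Optional[PendingStore]:
    """Return the youngest store older than the load that overlaps it, or None.

    This models the associative search that makes the structure costly:
    every load is compared against every older pending store.
    """
    older_hits = [
        s for s in stores
        if s.seq < load_seq and overlaps(s.addr, s.size, load_addr, load_size)
    ]
    return max(older_hits, key=lambda s: s.seq, default=None)


# Usage: three pending stores, then loads at different program positions.
stores = [PendingStore(1, 0x100, 4), PendingStore(2, 0x200, 8), PendingStore(5, 0x100, 4)]
hit = youngest_conflicting_store(stores, load_seq=4, load_addr=0x102, load_size=2)
miss = youngest_conflicting_store(stores, load_seq=4, load_addr=0x300, load_size=4)
```

A load for which this search can never find a match corresponds to the kind of load the abstract describes as guaranteed not to overlap with earlier pending stores.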
New trends such as the Internet of Things and smart homes increase the demand for energy efficiency. Ch...
The load-store queue (LQ-SQ) of modern superscalar processors is responsible for keeping the order of...
One of the main performance bottlenecks of processors today is the discrepancy between processor and...
One of the main challenges of modern processor designs is the implementation of scalable and efficie...
Because they are based on large content-addressable memories, load-store queues (LSQs) present imple...
Memory disambiguation mechanisms, coupled with load/store queues in out-of-ord...
Keywords: memory disambiguation, load-forwarding, speculation. The superscalar processor must issue instruction...
An efficient mechanism to track and enforce memory dependences is crucial to an out-of-order micropr...
To alleviate the memory wall problem, current architectural trends suggest implementing large instru...
This paper describes several methods for improving the scalability of memory disambiguation hardware...
Modern out-of-order processor architectures focus significantly on the high performance execution of...
The increase in the latencies of memory operations can be attributed to the increasing disparity bet...
Modern out-of-order processors tolerate long latency memory operations by supporting a large number ...
To alleviate the memory wall problem, current architectural trends suggest implementing large instr...