memory disambiguation, load-forwarding, speculation

A superscalar processor must issue instructions as early as possible to exploit instruction-level parallelism. Load instructions, however, can be issued safely only once their memory dependencies are known. Memory dependencies cannot be resolved before execution, so a load conventionally waits until the effective addresses of all prior stores have been calculated. This paper introduces the load forwarding history table (LFHT), which uses load-forwarding behavior to execute load instructions speculatively. The mechanism permits load instructions to execute speculatively without waiting for the effective addresses of prior stores to be calculated. The LFHT might provide about 7% average speedup over the baseline architecture in our ...
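The idea of consulting a load's forwarding history before letting it issue can be illustrated with a minimal sketch. The abstract does not describe how the LFHT is organized, so everything here is an assumption made for illustration: the class name `ForwardingHistoryTable`, the indexing by load PC, the 2-bit saturating counters, and the threshold are all hypothetical and are not the paper's design.

```python
class ForwardingHistoryTable:
    """Hypothetical PC-indexed table of 2-bit saturating counters.

    Assumption: a high counter means the load at this PC has recently
    received its value from an in-flight store (store-to-load forwarding),
    so it should wait for prior store addresses instead of speculating.
    """

    def __init__(self, entries=1024, threshold=2):
        self.entries = entries
        self.threshold = threshold
        self.counters = [0] * entries

    def _index(self, load_pc):
        # Drop the low two bits (assumes 4-byte instructions), then wrap.
        return (load_pc >> 2) % self.entries

    def may_issue_early(self, load_pc):
        # Speculate only when the load has rarely been forwarded.
        return self.counters[self._index(load_pc)] < self.threshold

    def update(self, load_pc, was_forwarded):
        # Train the counter once the load's true behavior is known.
        i = self._index(load_pc)
        if was_forwarded:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


table = ForwardingHistoryTable()
pc = 0x400100
print(table.may_issue_early(pc))  # untrained entry: speculation allowed
table.update(pc, was_forwarded=True)
table.update(pc, was_forwarded=True)
print(table.may_issue_early(pc))  # two forwards seen: load should wait
```

In this sketch, a load that is predicted not to depend on an earlier store issues without waiting for store addresses; a real design would also need a recovery path for mispredicted loads, which is omitted here.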
L2 misses are one of the main causes for stalling the activity in current and future microprocessors...
Two orthogonal hardware techniques, table-based address prediction and early address calculation, fo...
One major restriction to the performance of out-of-order superscalar processors is the latency of lo...
In high-end processors, increasing the number of in-flight instructions can improve performance by o...
Conventional superscalar processors usually contain large CAM-based LSQ (load/store queue) with poor...
Data speculation refers to the execution of an instruction before some logically preceding instruc...
Because they are based on large content-addressable memories, load-store queues (LSQ) present implem...
Processor performance is directly impacted by the latency of the memory system. As processor core cy...
By exploiting fine grain parallelism, superscalar processors can potentially increase the performance ...
Various memory consistency model implementations (e.g., x86, SPARC) willfully allow a core to see it...
By exploiting fine grain parallelism, superscalar processors can potentially increase the performanc...
One of the main performance bottlenecks of processors today is the discrepancy between processor and...
With the help of the memory dependence predictor the instruction scheduler can speculatively issue l...
The increase in the latencies of memory operations can be attributed to the increasing disparity bet...
Register promotion is an optimization that allocates a value to a register for a region of its lifet...