Address correlation is a technique that links the addresses that reference the same data values. Using a detailed source-code-level analysis, a recent study [1] revealed that different addresses containing the same data can often be correlated at run-time to eliminate on-chip data cache misses. In this paper, we study the upper-bound performance of an Address Correlation System (ACS) and discuss specific optimizations for a realistic hardware implementation. An ACS can effectively eliminate most of the L1 data cache misses by supplying the data from a correlated address already found in the cache, thereby improving the performance of the processor. For 10 of the SPEC CPU2000 benchmarks, 57 to 99% of all L1 data cache load misses can be ...
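The core idea of the abstract above can be sketched in a few lines. The toy class below is a hypothetical simplification (not the actual ACS hardware design from [1]): stores link addresses that currently hold the same value, and a load that misses in the L1 data cache is satisfied from a correlated address that is still cached, eliminating the miss.

```python
# Toy sketch of address correlation, assuming a dictionary stands in for
# the L1 data cache. All names here (ToyACS, store/load/evict) are
# illustrative, not part of the ACS described in the paper.

class ToyACS:
    def __init__(self):
        self.cache = {}       # address -> value (stand-in for the L1 D-cache)
        self.correlated = {}  # address -> set of addresses seen with the same value

    def store(self, addr, value):
        self.cache[addr] = value
        # Link every cached address that currently holds the same value.
        for other, v in self.cache.items():
            if other != addr and v == value:
                self.correlated.setdefault(addr, set()).add(other)
                self.correlated.setdefault(other, set()).add(addr)

    def evict(self, addr):
        # Correlation links deliberately survive eviction: that is what
        # lets a later miss be served from a correlated line.
        self.cache.pop(addr, None)

    def load(self, addr):
        if addr in self.cache:                      # ordinary L1 hit
            return self.cache[addr], "hit"
        for peer in self.correlated.get(addr, ()):  # correlation check on a miss
            if peer in self.cache:
                return self.cache[peer], "correlated"  # miss eliminated
        return None, "miss"                         # would go to L2/memory

acs = ToyACS()
acs.store(0x100, 42)
acs.store(0x200, 42)   # same value: 0x100 and 0x200 become correlated
acs.evict(0x100)       # 0x100 leaves the cache
value, kind = acs.load(0x100)  # served from correlated address 0x200
```

In this sketch the final load returns the value 42 with kind "correlated", i.e. a miss that a conventional cache would send to the next level is satisfied on-chip; a real implementation would bound the correlation table and track value changes, which is exactly the gap between the upper-bound study and a realistic hardware design.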
Because of the infeasibility or expense of large fully-associative caches, cache memories are often ...
Cache memory is one of the most important components of a computer system. The cache allows quickly...
In this correspondence, we propose design techniques that may significantly simplify the cache acces...
The most important processor performance bottleneck is the ever-increasing gap between the memory an...
On-chip caches to reduce average memory access latency are commonplace in today's commercial micr...
Hard-to-predict branches depending on long-latency cache-misses have been recognized as a major perf...
During the last two decades, CPU performance has improved much faster than that of memo...
To maximize the benefit and minimize the overhead of software-based latency tolerance techniques, we...
Processor performance is directly impacted by the latency of the memory system. As processor core cy...
While runahead execution is effective at parallelizing independent long-latency cache misses, it is ...
As CPU data requests to the level-one (L1) data cache (DC) can represent as much as 25% of an embed...
Application performance on modern microprocessors depends heavily on performance related characteris...
Data or instructions that are used regularly are kept in the cache so that they are very easy to retrieve ...