A new conceptual cache, the NRP (Non-Referenced Prefetch) cache, is proposed to improve the performance of instruction prefetch mechanisms that prefetch both sequential and non-sequential blocks under limited memory bandwidth. The NRP cache stores prefetched blocks that were not referenced by the CPU, blocks that previous prefetch mechanisms simply discarded. By retaining these non-referenced prefetched blocks in the NRP cache, both cache misses and memory traffic are reduced. A prefetch method that fetches both the sequential and the non-sequential instruction paths is designed to exploit the NRP cache. Results from trace-driven simulation show that this approach provides an improvem...
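The mechanism described above can be illustrated with a minimal sketch. This is not the paper's implementation; the class name, cache sizes, and LRU policy are assumptions chosen for illustration. The key idea it models is that prefetched blocks are staged in a small NRP side cache and promoted to the main cache only on an actual CPU reference, so unreferenced prefetches neither pollute the main cache nor must be re-fetched from memory.

```python
from collections import OrderedDict

class NRPCacheSim:
    """Toy model of the NRP (Non-Referenced Prefetch) cache idea:
    prefetched blocks go to a small side cache first and are moved
    into the main cache only when the CPU actually references them."""

    def __init__(self, main_size=8, nrp_size=4):
        self.main = OrderedDict()   # LRU main instruction cache (block ids)
        self.nrp = OrderedDict()    # LRU NRP cache for unreferenced prefetches
        self.main_size = main_size
        self.nrp_size = nrp_size
        self.memory_fetches = 0     # traffic to the next memory level

    def _install(self, cache, size, block):
        cache[block] = True
        cache.move_to_end(block)
        if len(cache) > size:
            cache.popitem(last=False)   # evict LRU block

    def prefetch(self, block):
        # Prefetched blocks are staged in the NRP cache, not the main cache.
        if block not in self.main and block not in self.nrp:
            self.memory_fetches += 1
            self._install(self.nrp, self.nrp_size, block)

    def reference(self, block):
        # CPU fetch: hit in main, promote from NRP, or miss to memory.
        if block in self.main:
            self.main.move_to_end(block)
            return "hit"
        if block in self.nrp:
            del self.nrp[block]          # the prefetch turned out useful
            self._install(self.main, self.main_size, block)
            return "nrp-hit"
        self.memory_fetches += 1          # demand miss
        self._install(self.main, self.main_size, block)
        return "miss"
```

In use, the prefetcher would issue `prefetch()` for both the sequential successor and a predicted branch target; whichever path the CPU actually follows is then served from the NRP cache instead of memory.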
In the last century great progress was achieved in developing processors with extremely high computa...
Ever increasing memory latencies and deeper pipelines push memory farther from the processor. Prefet...
Data prefetching is an effective technique to hide memory latency and thus bridge the increasing pro...
Instruction cache miss latency is becoming an increasingly important performance bottleneck, especia...
Instruction cache misses can severely limit the performance of both superscalar processors and high ...
As process scaling makes the memory system an even more crucial bottleneck, the importance of ...
Processor performance has increased far faster than memories have been able to keep up with, forcing...
Grantor: University of Toronto. The latency of accessing instructions and data from the memo...
Data-intensive applications often exhibit memory referencing patterns with little data reuse, result...
A common mechanism to perform hardware-based prefetching for regular accesses to arrays and chained...
Conventional cache prefetching approaches can be either hardware-based, generally by using a one-blo...
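The hardware scheme this entry alludes to appears to be one-block lookahead; the description is truncated, so the following is a hedged sketch of the prefetch-on-miss variant of that general technique, with all names invented for illustration: on a demand miss to block i, the cache also fetches block i+1.

```python
class OBLCache:
    """Sketch of one-block-lookahead (OBL) prefetching, prefetch-on-miss
    variant: a demand miss to block i also brings in block i + 1."""

    def __init__(self):
        self.blocks = set()      # resident block ids (capacity ignored here)
        self.demand_misses = 0

    def reference(self, block):
        if block in self.blocks:
            return "hit"
        self.demand_misses += 1
        self.blocks.add(block)
        self.blocks.add(block + 1)   # one-block-lookahead prefetch
        return "miss"
```

On a purely sequential instruction stream this halves the demand misses, which is why OBL is the usual hardware baseline against which more aggressive prefetchers are compared.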
Cache performance analysis is becoming increasingly important in microprocessor design. This work ex...
When designing a prefetcher, the computer architect has to define which event ...
Recent technological advances are such that the gap between processor cycle times and memory cycle t...