Prefetching is an important technique for reducing the average latency of memory accesses in scalable cache-coherent multiprocessors. Aggressive prefetching can significantly reduce the number of cache misses, but may introduce bursty network and memory traffic, and increase data sharing and cache pollution. Given that we anticipate enormous increases in both network bandwidth and latency, we examine whether aggressive prefetching triggered by a miss (cache-miss-initiated prefetching) can substantially improve the running time of parallel programs. Using execution-driven simulation of parallel programs on a scalable cache-coherent machine, we study the performance of three cache-miss-initiated prefetching techniques: large cache blocks, s...
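As an illustration of the cache-miss-initiated approach described above, the following minimal sketch (not taken from the paper; the block size, number of sets, and prefetch degree are illustrative assumptions) shows a toy direct-mapped cache in which every demand miss on block B also fills the next few sequential blocks:

/* Minimal sketch, not from the paper above: a toy direct-mapped cache in
 * which every demand miss on block B also fills blocks B+1..B+DEGREE,
 * i.e. cache-miss-initiated sequential prefetching. Block size, number of
 * sets, and prefetch degree are illustrative assumptions. */
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

#define BLOCK_BYTES  64      /* assumed cache block size */
#define NUM_SETS     1024    /* assumed number of direct-mapped sets */
#define DEGREE       2       /* assumed prefetch degree per miss */

typedef struct { bool valid; uint64_t tag; } line_t;
static line_t cache[NUM_SETS];

static bool lookup(uint64_t block)
{
    line_t *l = &cache[block % NUM_SETS];
    return l->valid && l->tag == block;
}

static void fill(uint64_t block)
{
    line_t *l = &cache[block % NUM_SETS];
    l->valid = true;
    l->tag = block;
}

/* Demand access: on a miss, fetch the missing block and prefetch the next DEGREE blocks. */
static bool cache_access(uint64_t addr)
{
    uint64_t block = addr / BLOCK_BYTES;
    if (lookup(block))
        return true;                          /* demand hit */
    fill(block);                              /* demand fill */
    for (int d = 1; d <= DEGREE; d++)
        fill(block + d);                      /* miss-initiated sequential prefetch */
    return false;                             /* demand miss */
}

int main(void)
{
    unsigned hits = 0, misses = 0;
    for (uint64_t a = 0; a < 64 * 1024; a += 8)   /* simple sequential walk */
        cache_access(a) ? hits++ : misses++;
    printf("hits=%u misses=%u\n", hits, misses);
    return 0;
}

On the sequential walk in main, the two-block lookahead converts most demand misses into hits; the extra fills it issues per miss are also a simple picture of where the bursty traffic and cache pollution mentioned in the abstract come from.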
High-performance I/O systems depend on prefetching and caching in order to deliver good performance ...
Chip multiprocessors (CMPs) present a unique scenario for software data prefetching with subtle trad...
This paper presents new analytical models of the performance benefits of multithreading and prefetc...
Compiler-directed cache prefetching has the potential to hide much of the high memory latency seen ...
Prefetching, i.e., exploiting the overlap of processor computations with data accesses, is one of s...
Memory latency has always been a major issue in shared-memory multiprocessors and high-speed systems...
In the last century great progress was achieved in developing processors with extremely high computa...
As process scaling trends make the memory system an even more crucial bottleneck, the importance of ...
This work was also published as a Rice University thesis/dissertation: http://hdl.handle.net/1911/19...
Cache performance analysis is becoming increasingly important in microprocessor design. This work ex...
Memory stalls are a significant source of performance degradation in modern processors. Data prefetc...
As the difference in speed between the processor and the memory system continues to increase, it is...
A set of hybrid and adaptive prefetching schemes is considered in this paper. The prefetchers are h...
As the degree of instruction-level parallelism in superscalar architectures increases, the gap betwe...