The increasing performance gap between processors and memory will force future architectures to devote significant resources towards removing and hiding memory latency. The two major architectural features used to address this growing gap are caches and prefetching. In this paper we perform a detailed quantification of the cache miss patterns for the Olden benchmarks, SPEC 2000 benchmarks, and a collection of pointer-based applications. We classify misses into one of four categories corresponding to the type of access pattern. These are next-line, stride, same-object (additional misses that occur to a recently accessed object), or pointer-based transitions. We then propose and evaluate a hardware profiling architecture to correctly ide...
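As an illustration of the four miss categories named in the abstract above, here is a minimal Python sketch that labels a trace of miss addresses. It is not the paper's profiling architecture. The 64-byte line size, the priority order of the checks, the history depth, and the inputs `object_ranges` and `pointer_targets` are all assumptions made for this example.

```python
from collections import deque

LINE_SIZE = 64  # assumed cache line size in bytes (not from the source)


def classify_misses(miss_addrs, object_ranges=(), pointer_targets=(), history=8):
    """Label each miss address as next-line, stride, same-object, pointer, or other.

    object_ranges: hypothetical list of (start, end) address ranges, one per object.
    pointer_targets: hypothetical set of addresses previously loaded as pointer values.
    The order in which categories are tested is an assumption of this sketch.
    """
    pointer_targets = set(pointer_targets)
    recent_objects = deque(maxlen=history)  # objects touched by recent misses
    labels = []
    prev_line = None
    prev_stride = None

    for addr in miss_addrs:
        line = addr // LINE_SIZE
        # Index of the object containing this address, if any.
        obj = next(
            (i for i, (lo, hi) in enumerate(object_ranges) if lo <= addr < hi),
            None,
        )
        stride = None if prev_line is None else line - prev_line

        if stride == 1:
            label = "next-line"      # miss to the line right after the previous miss
        elif stride is not None and stride != 0 and stride == prev_stride:
            label = "stride"         # same non-unit line distance twice in a row
        elif obj is not None and obj in recent_objects:
            label = "same-object"    # another miss to a recently accessed object
        elif addr in pointer_targets:
            label = "pointer"        # address was reached by following a pointer
        else:
            label = "other"

        labels.append(label)
        if obj is not None:
            recent_objects.append(obj)
        prev_stride = stride
        prev_line = line

    return labels


if __name__ == "__main__":
    trace = [0, 64, 128, 1024, 1920, 8192, 8384]
    print(classify_misses(trace, object_ranges=[(8192, 8448)], pointer_targets=[8192]))
```

A real profiler would see loaded values and allocation boundaries directly; here they are passed in as precomputed inputs to keep the sketch self-contained.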
Data cache misses reduce the performance of wide-issue processors by stalling the data supply to the...
Recent technological advances are such that the gap between processor cycle times and memory cycle t...
This thesis considers two approaches to the design of high-performance computers. In a single proces...
CPU speeds double approximately every eighteen months, while main memory speeds double only about ev...
A major performance limiter in modern processors is the long latency caused by data cache misses. ...
In this dissertation, we provide hardware solutions to increase the efficiency of the cache hierarch...
A well-known performance bottleneck in computer architecture is the so-called memory wall. This term...
Processor performance has increased far faster than memories have been able to keep up with, forcing...
Given the increasing gap between processors and memory, prefetching data into cache become...
grantor: University of Toronto. The latency of accessing instructions and data from the memo...
Compiler-directed cache prefetching has the potential to hide much of the high memory latency seen ...
As process-scaling trends make the memory system an even more crucial bottleneck, the importance of ...
As the degree of instruction-level parallelism in superscalar architectures increases, the gap betwe...
Prefetching, i.e., exploiting the overlap of processor computations with data accesses, is one of s...
Cache performance analysis is becoming increasingly important in microprocessor design. This work ex...