A multiprocessor prefetch scheme is described in which a miss is followed by a prefetch of a group of lines, a neighborhood, surrounding the demand-fetched line. The neighborhood is based on the data address and the past behavior of the instruction that missed in the cache. A neighborhood for an instruction is constructed by recording the offsets of addresses that subsequently miss. Like sequential prefetch, neighborhood prefetching can exploit sequential access and, like stride prefetch, it can to some extent exploit stride access. Unlike stride and sequential prefetch, it can also support irregular access patterns. Neighborhood prefetching was compared to adaptive sequential prefetching using execution-driven simulation. Results show more usef...
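The mechanism can be illustrated with a small software model. The following is a minimal sketch under stated assumptions, not the hardware design evaluated in the paper: the class name `NeighborhoodPrefetcher`, the constants `LINE_SIZE`, `RADIUS`, and `WINDOW`, the unbounded per-instruction table, and the cache that never evicts are all hypothetical choices made for illustration. The abstract specifies only that a neighborhood is built per instruction from the offsets of subsequent misses and is prefetched around the demand-fetched line.

```python
from collections import defaultdict, deque

# All three constants are assumptions for illustration; the abstract gives no values.
LINE_SIZE = 64  # bytes per cache line
RADIUS = 4      # largest line offset kept in a neighborhood
WINDOW = 8      # number of recent misses that can still learn from a new miss


class NeighborhoodPrefetcher:
    """Toy model: one neighborhood (a set of line offsets) per instruction."""

    def __init__(self):
        self.neighborhoods = defaultdict(set)      # pc -> learned line offsets
        self.recent_misses = deque(maxlen=WINDOW)  # (pc, line) of recent misses
        self.cache = set()                         # resident lines, never evicted

    def access(self, pc, addr):
        """Simulate one load and return the list of lines prefetched by it."""
        line = addr // LINE_SIZE
        if line in self.cache:
            return []                              # hit: nothing to learn or fetch
        self.cache.add(line)                       # demand fetch of the missed line

        # Learning: this miss extends the neighborhood of every recent
        # missing instruction whose miss address was close enough.
        for miss_pc, miss_line in self.recent_misses:
            offset = line - miss_line
            if offset != 0 and abs(offset) <= RADIUS:
                self.neighborhoods[miss_pc].add(offset)

        # Prefetch: fetch the learned neighborhood around the missed line.
        prefetched = []
        for offset in sorted(self.neighborhoods[pc]):
            target = line + offset
            if target >= 0 and target not in self.cache:
                self.cache.add(target)
                prefetched.append(target)

        self.recent_misses.append((pc, line))
        return prefetched
```

In this model, if the instruction at a hypothetical address `0xA` misses on line 100 and misses on lines 103 and 98 follow, its neighborhood becomes the offsets {+3, -2}. A later miss by `0xA` on line 500 then prefetches lines 498 and 503, an irregular pattern that neither a sequential nor a single-stride prefetcher would capture.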
Modern computer systems spend a substantial fraction of their running time waiting for data from...
Memory access latency is a main bottleneck limiting further improvement of multi-core proces...
It is well known that memory latency is a major deterrent to achieving the maximum possible performa...
Instruction cache miss latency is becoming an increasingly important performance bottleneck, especia...
Instruction cache misses can severely limit the performance of both superscalar processors and high ...
Memory latency has always been a major issue in shared-memory multiprocessors and high-speed systems...
Data prefetching of regular access patterns is an effective mechanism to hide the memory la...
In this paper, we present our design of a high performance prefetcher, which exploits various locali...
Prefetching, i.e., exploiting the overlap of processor computations with data accesses, is one of s...
The large latency of memory accesses in modern computer systems is a key obstacle to achieving high ...
This paper presents a novel pointer prefetching technique, called multi-chain prefetching. Multi-cha...
Prefetching is an important technique for reducing the average latency of memory accesses in scalabl...
Hardly predictable data addresses in many irregular applications have rendered prefetching ineffec...
A major performance limiter in modern processors is the long latencies caused by data cache misses. ...
In the last century great progress was achieved in developing processors with extremely high computa...