This paper presents a novel pointer prefetching technique, called multi-chain prefetching. Multi-chain prefetching tolerates serialized memory latency commonly found in pointer-chasing codes via aggressive prefetch scheduling. Unlike conventional prefetching techniques that hide memory latency underneath a single traversal loop or recursive function exclusively, multi-chain prefetching initiates prefetches for a chain of pointers prior to the traversal code, thus exploiting "pre-loop" work to help overlap serialized memory latency. As prefetch chains are scheduled increasingly early to accommodate long pointer chains, multi-chain prefetching overlaps prefetches across multiple independent linked structures, thus exploiting the nat...
The large latency of memory accesses in modern computers is a key obstacle in achieving high process...
Supercomputer architectures are not as fast as logic technology allows because memories are slow th...
Instruction cache miss latency is becoming an increasingly important performance bottleneck, especia...
Pointer-chasing applications tend to traverse composed data structures consisting of multiple indepe...
Software-controlled data prefetching offers the potential for bridging the ever-increasing speed gap...
Memory latency is becoming an increasingly important performance bottleneck as the gap between processor ...
CPU speeds double approximately every eighteen months, while main memory speeds double only about ev...
While many parallel applications exhibit good spatial locality, other important codes in areas like ...
Data prefetching effectively reduces the negative effects of long load latencies on the performance ...
A multiprocessor prefetch scheme is described in which a miss is followed by a prefetch of a group o...
This dissertation considers the use of data prefetching and an alternative mechanism, data forwardin...
In recent years, processor speed has become increasingly faster than memory speed. One technique for...
Grantor: University of Toronto. The latency of accessing instructions and data from the memo...
Recent advances in integrating logic and DRAM on the same chip potentially open up new avenues for a...