Modern processors and compilers hide long memory latencies through non-blocking loads or explicit software prefetching instructions. Unfortunately, each mechanism has potential drawbacks. Non-blocking loads can significantly increase register pressure by extending the lifetimes of loads. Software prefetching increases the number of memory instructions in the loop body. For a loop whose execution time is bound by the number of loads/stores that can be issued per cycle, software prefetching exacerbates this problem and increases the number of idle computational cycles in loops. In this paper, we show how compiler and architecture support for combining a load and a prefetch into one instruction, called a prefetching load, can give lower re...
Prefetching, i.e., exploiting the overlap of processor com-putations with data accesses, is one of s...
Processor design techniques, such as pipelining, superscalar, and VLIW, have dramatically decreased ...
Current microprocessors aggressively exploit instruction-level parallelism (ILP) through techniques ...
Ever increasing memory latencies and deeper pipelines push memory farther from the processor. Prefet...
A major performance limiter in modern processors is the long latencies caused by data cache misses. ...
This work was also published as a Rice University thesis/dissertation: http://hdl.handle.net/1911/19...
Despite large caches, main-memory access latencies still cause significant performance losses in man...
The paper investigates the interaction between software pipelining and different software prefetchin...
Many modern data processing and HPC workloads are heavily memory-latency bound. A tempting propositi...
A key obstacle to achieving high performance on software dis...
Instruction cache miss latency is becoming an increasingly important performance bottleneck, especia...
In computer systems, latency tolerance is the use of concurrency to achieve high performance in spit...
Memory latency is becoming an increasingly important performance bottleneck as the gap between processor ...