Modern architectures provide hardware memory prefetching capabilities which can be configured at runtime. While hardware prefetching can provide substantial performance improvements for many programs, prefetching can also increase contention for shared resources such as last-level cache and memory bandwidth. In turn, this contention can degrade performance in multi-core workloads. In this paper, we model fine-grained hardware prefetcher control as a contextual bandit, and propose a framework for learning prefetcher control policies which adjust hardware prefetching usage at runtime according to workload performance behavior. We train our policies on profiling data, wherein hardware memory prefetchers are enabled or disabled randomly at regu...
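As a rough illustration of the contextual-bandit framing described above, the following Python sketch learns a prefetcher on/off policy from logged intervals in which the action was chosen uniformly at random. It is a minimal sketch under stated assumptions, not the paper's method: the two context buckets, the reward function `simulated_reward`, and all numeric constants are hypothetical stand-ins for real performance-counter features and measured performance.

```python
import random
from collections import defaultdict

# Hypothetical action encoding: 0 = prefetcher disabled, 1 = prefetcher enabled.
ACTIONS = (0, 1)


def simulated_reward(context, action, rng):
    """Toy stand-in for performance measured over one interval (e.g. IPC).

    Assumption for illustration only: context 0 is a bandwidth-bound phase
    where prefetching hurts, context 1 is a latency-bound phase where it helps.
    """
    effect = 0.3 if context == 1 else -0.2
    return 1.0 + (effect if action == 1 else 0.0) + rng.gauss(0.0, 0.05)


def collect_profile(n_intervals, rng):
    """Log (context, action, reward) tuples with the action chosen at random."""
    log = []
    for _ in range(n_intervals):
        context = rng.choice((0, 1))
        action = rng.choice(ACTIONS)  # random toggling during profiling
        log.append((context, action, simulated_reward(context, action, rng)))
    return log


def fit_policy(log):
    """For each context, pick the action with the highest mean logged reward."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for context, action, reward in log:
        sums[(context, action)] += reward
        counts[(context, action)] += 1
    policy = {}
    for context in {c for c, _, _ in log}:
        means = {
            a: sums[(context, a)] / counts[(context, a)]
            for a in ACTIONS
            if counts[(context, a)] > 0
        }
        policy[context] = max(means, key=means.get)
    return policy


rng = random.Random(0)
policy = fit_policy(collect_profile(2000, rng))
```

Because the logged actions are uniformly random, per-context mean rewards are unbiased estimates of each action's value, which is why a simple tabular argmax suffices here. A real controller would replace the context buckets with hardware counter readings and the table with a learned model.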
Loads that miss in L1 or L2 caches and wait for their data at the head of the ROB cause significant...
Despite large caches, main-memory access latencies still cause significant performance losses in man...
Memory latency is a major factor in limiting CPU performance, and prefetching is a well-k...
Modern processors are equipped with multiple hardware prefetchers, each of which targets a ...
Data prefetching is an effective way to bridge the increasing performance gap ...
Hardware prefetching on IBM’s latest POWER8 processor is able to improve performance of many applica...
In multi-core systems, an application's prefetcher can interfere with the memo...
An important technique for alleviating the memory bottleneck is data prefetching. Data prefetching ...
A well-known performance bottleneck in computer architecture is the so-called memory wall. This term...
A major performance limiter in modern processors is the long latencies caused by data cache misses. ...
Given the increasing gap between processors and memory, prefetching data into cache become...
High performance processors employ hardware data prefetching to reduce the negative performance impa...
Machine Learning (ML) has gained prominence in recent years and is currently being used in a wide ra...
The von Neumann bottleneck is a persistent problem in computer architecture, causing stalls and waste...
The widely acknowledged performance gap between processors and memory has been the subject of much r...