Hardware prefetching on IBM’s latest POWER8 processor can significantly improve the performance of many applications, but it can also degrade performance for others. The POWER8 processor provides one of the most sophisticated hardware prefetching designs, supporting 225 different configurations. Finding the optimal or near-optimal hardware prefetching configuration for a specific application is therefore a major challenge. In this paper we present a dynamic prefetch tuning scheme named Prefetch Automatic Tuner (PATer). PATer uses a machine-learning prediction model to dynamically tune the prefetch configuration based on the values of hardware performance monitoring counters (PMCs). By developing a two-phase prefe...
A well-known performance bottleneck in computer architecture is the so-called memory wall. This term...
In multi-core systems, an application's prefetcher can interfere with the memo...
High performance processors employ hardware data prefetching to reduce the negative performance impa...
Modern processors are equipped with multiple hardware prefetchers, each of which targets a ...
Current multi-core processors implement sophisticated hardware prefetchers that can be configu...
Modern architectures provide hardware memory prefetching capabilities which can be configured at run...
The benefits of prefetching have been largely overshadowed by the overhead required to produce high...
Data prefetching is an effective way to bridge the increasing performance gap ...
Current microprocessors include several knobs to modify the hardware behavior in order to improve pe...
The widely acknowledged performance gap between processors and memory has been the subject of much r...
A major performance limiter in modern processors is the long latencies caused by data cache misses. ...
Hardware prefetching improves system performance by hiding and tolerating the latencies of ...
Chip Multiprocessors (CMP) are an increasingly popular architecture and increasing numbers of vendor...