The reference stream reaching a chip multiprocessor Shared Last-Level Cache (SLLC) shows poor temporal locality, making conventional cache management policies inefficient, and few proposals address this problem for exclusive caches. In this paper, we propose the Reuse Detector (ReD), a new content selection mechanism for exclusive hierarchies that leverages reuse locality at the SLLC, the observation that blocks referenced more than once are more likely to be accessed again in the near future. Placed between each private L2 cache and the SLLC, ReD prevents the insertion of blocks without reuse into the SLLC. It is designed to overcome problems affecting similar recent mechanisms (low accuracy, reduced visibility window and detector thra...
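The core idea described above, admitting a block into the SLLC only once it has demonstrated reuse, can be sketched with a small tracking table. This is a minimal illustrative model, not the paper's implementation: the class and method names (`ReuseDetector`, `should_insert`) and the LRU-managed table are assumptions for the sketch.

```python
# Minimal sketch of a reuse-based SLLC insertion filter.
# Hypothetical model: a small LRU table records block addresses seen on
# L2 eviction; a block is only admitted to the SLLC once its address is
# observed a second time (i.e., it shows reuse).
from collections import OrderedDict

class ReuseDetector:
    def __init__(self, capacity=1024):
        self.capacity = capacity      # detector table size (entries)
        self.seen = OrderedDict()     # address -> True, in LRU order

    def should_insert(self, addr):
        """Return True if the evicted block should enter the SLLC."""
        if addr in self.seen:
            # Second (or later) sighting: the block shows reuse.
            self.seen.move_to_end(addr)
            return True
        # First sighting: record it and bypass the SLLC this time.
        self.seen[addr] = True
        if len(self.seen) > self.capacity:
            self.seen.popitem(last=False)  # evict LRU detector entry
        return False

# Usage: first touch bypasses, a repeated touch is admitted.
det = ReuseDetector(capacity=4)
det.should_insert(0x1000)   # first touch  -> bypass (False)
det.should_insert(0x1000)   # second touch -> insert (True)
```

A finite table like this is what creates the "visibility window" the abstract alludes to: once an address ages out of the detector, its earlier reuse history is forgotten.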
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer...
Directory-based cache coherence is a popular mechanism for chip multiprocessors and multicores. The ...
In future multi-cores, large amounts of delay and power will be spent accessing data...
In this paper, we propose a new block selection policy for Last-Level Caches (LLCs) that decides, ba...
With off-chip memory access taking hundreds of processor cycles, getting data to the processor in a tim...
Multi-level buffer cache hierarchies are now commonly seen in most client/server cluster config...
Last-Level Cache (LLC) represents the bulk of a modern CPU processor's transistor budget and is esse...
Most chip-multiprocessors nowadays adopt a large shared last-level cache (SLLC). This paper is motiv...
Various constraints of Static Random Access Memory (SRAM) are leading to consider new memory technol...
As transistor density continues to grow geometrically, processor manufacturers are already able to p...
Memory latency has become an important performance bottleneck in current microprocessors. This probl...
Judicious management of on-chip last-level caches (LLC) is critical to alleviating the memory wall o...
Current architectural trends of rising on-chip core counts and worsening power-performance penalties...