Dynamically tagged directories are memory-efficient mechanisms for maintaining cache coherence in large-scale shared-memory multiprocessors. These directories use special-purpose tag caches that are subject to two types of overflow: 1) pointer overflow, which limits the maximum number of processors that can simultaneously share a memory block, and 2) set overflow, which forces the premature invalidation of cached memory blocks. These invalidations of actively referenced blocks cause extra data cache misses, which can reduce the system's memory performance. We propose a superassociative tagged directory structure that can preserve some of the cached copies of a memory block when a set overflows by allowing multiple address tags in the ...
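The two overflow cases described above can be illustrated with a toy model. This is a minimal sketch, not the structure proposed in the abstract: the class name `TagDirectory`, its parameters, the LRU replacement policy, and the choice to evict the oldest sharer on pointer overflow are all assumptions made for illustration.

```python
from collections import OrderedDict


class TagDirectory:
    """Toy set-associative tag directory (hypothetical names and policies)."""

    def __init__(self, num_sets=4, ways=2, max_pointers=2):
        self.num_sets = num_sets          # number of sets in the tag cache
        self.ways = ways                  # address tags per set
        self.max_pointers = max_pointers  # sharer pointers per tag entry
        # One OrderedDict per set: block address -> list of sharer ids,
        # ordered from least to most recently used.
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def record_read(self, block, cpu):
        """Record that `cpu` cached `block`.

        Returns the (cpu, block) invalidations the directory had to issue.
        """
        invalidations = []
        entries = self.sets[block % self.num_sets]

        if block in entries:
            sharers = entries[block]
            entries.move_to_end(block)  # mark the tag as most recently used
            if cpu in sharers:
                return invalidations
            if len(sharers) == self.max_pointers:
                # Pointer overflow: no free pointer, so one existing sharer
                # loses its copy (assumed policy: evict the oldest sharer).
                victim = sharers.pop(0)
                invalidations.append((victim, block))
            sharers.append(cpu)
            return invalidations

        if len(entries) == self.ways:
            # Set overflow: the set has no free tag, so the least recently
            # used block is dropped and every cached copy of it is
            # invalidated, even if it is still actively referenced.
            victim_block, victim_sharers = entries.popitem(last=False)
            invalidations.extend((c, victim_block) for c in victim_sharers)

        entries[block] = [cpu]
        return invalidations
```

With a single set of two tags and two pointers per tag, a third sharer of one block triggers a pointer overflow, and a third distinct block triggers a set overflow that invalidates every remaining copy of the evicted block.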
We characterize the cache behavior of an in-memory tag table and demonstrate that an optimized imple...
We propose a novel energy-efficient memory architecture which relies on the use of cache with a redu...
Caches have the potential to provide multiprocessors with an automatic mechanism for reducing both n...
Dynamically tagged directories have been proposed as a memory-efficient mechanism for maintaining ca...
A directory-based cache coherence protocol is accepted as the common technique in large-scale shared m...
Emerging multiprocessor architectures such as chip multiprocessors, embedded architectures, and mas...
Recent research shows that the occupancy of the coherence controllers is a major performance bottlen...
The cache coherence problem is a major concern in the design of shared-memory multiprocessors. As the nu...
A key challenge in architecting a multicore processor is efficiently maintaining cache coherence. Di...
Both hardware-controlled and compiler-directed mechanisms have been proposed for maintaining cache c...
Parallel applications exhibit a wide variety of memory reference patterns. Designing a memory archit...
This thesis presents a new cache coherence protocol for shared bus multicache systems, and addresses...
A cache coherence protocol for a multiprocessor system. Each processor in the system has...
In large scale machines, thousands of processor cycles, in other words, missed opportunities to issu...
This paper presents a non-blocking directory-based cache coherence protocol to improve the performan...