This paper advocates that cache coherence protocols use a bandwidth-adaptive approach to adjust to varied system configurations (e.g., number of processors) and workload behaviors. We propose Bandwidth Adaptive Snooping Hybrid (BASH), a hybrid protocol that ranges from behaving like snooping (by broadcasting requests) when excess bandwidth is available to behaving like a directory protocol (by unicasting requests) when bandwidth is limited. BASH adapts dynamically by probabilistically deciding to broadcast or unicast on a per-request basis using a local estimate of recent interconnection network utilization. Simulations of a microbenchmark and commercial and scientific workloads show that BASH robustly performs as well as or better than the be...
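The per-request decision described in the BASH abstract can be illustrated with a minimal sketch. It is not the paper's mechanism: the class name `AdaptiveRequester`, the 0.75 utilization target, the 0.05 adjustment step, and the simple additive update rule are all assumptions chosen for illustration. The sketch only shows the general idea of lowering the broadcast probability when the locally observed utilization is high and raising it when bandwidth is plentiful.

```python
import random


class AdaptiveRequester:
    """Hypothetical per-node policy: pick broadcast or unicast for each request."""

    def __init__(self, target_utilization=0.75, step=0.05, seed=None):
        self.target_utilization = target_utilization  # assumed threshold, not from the paper
        self.step = step                              # assumed adjustment size
        self.broadcast_prob = 1.0                     # start out behaving like snooping
        self._rng = random.Random(seed)

    def observe(self, utilization):
        """Adjust the broadcast probability from a local utilization estimate in [0, 1]."""
        if utilization > self.target_utilization:
            # Bandwidth is scarce: drift toward directory-like unicasts.
            self.broadcast_prob = max(0.0, self.broadcast_prob - self.step)
        else:
            # Bandwidth is available: drift back toward snooping-like broadcasts.
            self.broadcast_prob = min(1.0, self.broadcast_prob + self.step)

    def choose(self):
        """Probabilistically decide how to send the next request."""
        if self._rng.random() < self.broadcast_prob:
            return "broadcast"
        return "unicast"


if __name__ == "__main__":
    node = AdaptiveRequester(seed=1)
    for utilization in (0.9, 0.9, 0.9, 0.4):
        node.observe(utilization)
        print(f"util={utilization:.2f} p={node.broadcast_prob:.2f} -> {node.choose()}")
```

Under sustained high utilization the probability is clamped at 0, so every request is unicast. Under sustained low utilization it is clamped at 1, so every request is broadcast. Intermediate loads give a mix of the two.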
Traditional coherence protocols present a set of difficult tradeoffs: the reliance of snoopy protoco...
Multicore systems have reached a stage where they are inevitable in the embedded world. This transit...
This project was designed to discover the relationship between the number of enabled rules maintaine...
Write-invalidate and write-broadcast coherency protocols have been criticized for being unable to ac...
Destination-set prediction can improve the latency/bandwidth tradeoff in shared-memory multiprocesso...
This invited paper argues that to facilitate formal verification, multiprocessor systems should (1) ...
The coherence protocol is a first-order design concern in multicore designs. Directory protocols are...
With transistor miniaturization leading to an abundance of on-chip resources and uniprocessor design...
Previous studies of bus-based shared-memory multiprocessors have shown hybrid write-invalidate/write...
In this paper, we develop a specification methodology that documents and specifies a cache coherence...
Although directory-based write-invalidate cache coherence protocols have the potential to improve th...
We present a novel methodology for power reduction in embedded multiprocessor systems. Maintaining l...
Design complexity and limited power budget are causing the number of cores on the same chip to grow ...
The speed of processors increases much faster than the memory access time. This makes memory accesse...
With the advent of big data, modern businesses face an increasing need to store and process large vo...