Journal Article. For a parallel architecture to scale effectively, communication latency between processors must be avoided. We have found that the source of a large number of avoidable cache misses is the use of hardwired write-invalidate coherency protocols, which often exhibit high cache miss rates due to excessive invalidations and subsequent reloading of shared data. In the Avalanche project at the University of Utah, we are building a 64-node multiprocessor designed to reduce the end-to-end communication latency of both shared memory and message passing programs. As part of our design efforts, we are evaluating the potential performance benefits and implementation complexity of providing hardware support for multiple coherency protoc...
This paper considers a large scale, cache-based multiprocessor that is interconnected by a hierarchi...
grantor: University of Toronto. Implementing multiple processors on a single chip is one of ...
An optimization scheme for a directory-based cache coherence protocol for multistage int...
technical report. As the gap between processor and memory speeds widens, system designers will inevita...
This dissertation explores techniques for reducing the costs of inter-processor communication i...
technical report. In this paper, we describe the design of the Avalanche multiprocessor's shared memor...
Journal Article. Minimizing communication latency in message passing multiprocessing systems is critic...
Shared-memory multiprocessors built from commodity microprocessors are being increasingly used to pr...
Shared-memory multiprocessors are becoming increasingly popular as a high-performance, easy to progr...
Thesis (Ph. D.)--University of Washington, 1987. Shared-memory multiprocessors offer increased computa...
Due to VLSI lithography problems and the limitation of additional architectural enhancements uniproc...
Invalidation-based cache coherence protocols have been extensively studied in the context of large-s...
technical report. The next generation of scalable parallel systems (e.g., machines by KSR, Convex, and...
Cache coherence protocols for shared-memory multiprocessors use invalidations or updates to maintain...
Cache coherence is one of the main challenges to tackle when designing a shared-memory multiprocesso...