Distributed shared-memory architectures typically employ a directory-based protocol to maintain cache coherence. Identifying sharing patterns in parallel programs and applying specialized optimizations can increase the efficiency of the cache-coherence protocol and yield performance improvements. In this thesis, I propose and study both optimizations for sharing patterns and techniques to identify them. The main thrust of the thesis is GLOW, a comprehensive optimization for wide sharing, a sharing pattern that is a serious obstacle to scaling to large numbers of processors. I present GLOW in the form of extensions to the ANSI/IEEE SCI standard. GLOW is implemented in special network switches and incorporates characteristics that are n...
This paper considers a large-scale, cache-based multiprocessor that is interconnected by a hierarchi...
Single-chip multiprocessors and multiple-thread architectures are becoming an affordable solution fo...
With the emergence of manycore processors with potentially hundreds of process...
This thesis presents a new cache coherence protocol for shared bus multicache systems, and addresses...
One common cause of poor performance in large-scale shared-memory multiprocessors is limited memory ...
In this thesis we propose and evaluate an architecture to build large-scale distributed shared memor...
Workstation networks can become teraFLOPS supercomputers by adding high-speed interfaces supporting s...
This dissertation explores techniques for reducing the costs of inter-processor communication i...
This paper proposes using shared memory for caching latency-sensitive distributed data structures on...
As a result of advances in processor and network speeds, more and more applications can productively...
The transition to multi-core architectures can be attributed mainly to fundamental limitations in cl...
Multiprocessors with shared memory are considered more general and easier to program than message-pa...
Thesis (Ph. D.)--University of Washington, 1987. Shared-memory multiprocessors offer increased computa...
This paper considers alternative directory protocols for providing cache coherence in shared-memory ...
It is our thesis that scalable synchronization can be achieved with only minimal hardware support, s...