As small-scale shared memory multiprocessors proliferate in the market, it is very attractive to construct large-scale systems by connecting smaller multiprocessors together in software using efficient commodity network interfaces and networks. Using a shared virtual memory (SVM) layer for this purpose preserves the attractive shared memory programming abstraction across nodes. In this paper:
• We describe home-based SVM protocols that support symmetric multiprocessor (SMP) nodes, taking advantage of the intra-node hardware cache coherence and synchronization mechanisms. Our protocols take no special advantage of the network interface and network except as a fast communication link, and as such are very portable. We present the key design ...
We describe a methodology for developing high performance programs running on clusters of SMP nodes....
We introduce the SMTp architecture - an SMT processor augmented with a coherence protocol thread con...
We describe a methodology for developing high performance programs running on clusters of SMP no...
In this paper we examine how application performance scales on a state-of-the-art shared virtual mem...
Parallel systems supporting a shared memory programming interface have been implemented both in soft...
The performance of page-based software shared virtual memory (SVM) is still far from that achieved o...
Clusters of workstations have long provided a cost-effective, large-scale parallel computing platfor...
Many-core architectures of the future are likely to have distributed memory organizations and need f...
Commercial SMP nodes are an attractive building block for software distributed shared memory systems...
In this paper, we present the first system that implements OpenMP on a network of shared-memory mult...
Recently there has been a lot of effort in providing cost-effective Shared Memory systems by employi...
We first describe the design and implementation of a distributed shared memory system for a cluster o...
Low-latency, remote-write-access networks have recently become commodity items. These networks can c...
This paper describes a novel methodology for implementing a common set of collective communication o...