We present design details and some initial performance results of a novel scalable shared memory multiprocessor architecture that incorporates the major strengths of several contemporary multiprocessor architectures while avoiding their most serious weaknesses. Specifically, our architecture design incorporates the automatic data migration and replication features of cache-only memory architecture (COMA) machines, but replaces much of the complex hardware of COMA with a software layer that manages page-grained cache space allocation, as found in distributed virtual shared memory (DVSM) systems. Unlike DVSM, however, pages are subdivided into cache-line-sized blocks, and for shared pages the coherence of these blocks is maintained by hardwar...
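The split described in this abstract, page-grained allocation in software and block-grained tracking beneath it, can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the authors' design: the page size, block size, the two-state validity tag, and the names `Node`, `PageFrame`, and `BlockState` are all hypothetical, and no coherence protocol between nodes is modelled.

```python
from dataclasses import dataclass, field
from enum import Enum

PAGE_SIZE = 4096   # assumed page size in bytes (not given in the abstract)
BLOCK_SIZE = 64    # assumed cache-line size in bytes (not given in the abstract)
BLOCKS_PER_PAGE = PAGE_SIZE // BLOCK_SIZE


class BlockState(Enum):
    """Hypothetical per-block tag; a real design would likely have more states."""
    INVALID = 0  # block data is not present in the local page frame
    VALID = 1    # block data is present in the local page frame


@dataclass
class PageFrame:
    """A locally allocated page whose blocks are tracked individually."""
    states: list = field(
        default_factory=lambda: [BlockState.INVALID] * BLOCKS_PER_PAGE
    )


class Node:
    """Toy model of one node's local memory used as a page-grained cache."""

    def __init__(self):
        self.frames = {}        # virtual page number -> PageFrame
        self.page_faults = 0    # allocations the software layer would handle
        self.block_fetches = 0  # block misses the hardware would handle

    def access(self, addr):
        vpn, offset = divmod(addr, PAGE_SIZE)
        block = offset // BLOCK_SIZE
        frame = self.frames.get(vpn)
        if frame is None:
            # Software path: allocate space for the whole page, with every
            # block initially marked invalid.
            self.page_faults += 1
            frame = self.frames[vpn] = PageFrame()
        if frame.states[block] is BlockState.INVALID:
            # Hardware path (modelled only as a counter): bring in this
            # single block rather than the whole page.
            self.block_fetches += 1
            frame.states[block] = BlockState.VALID
        return vpn, block
```

In this model, touching a new page costs one software allocation, while each further miss within that page is counted at block granularity, which is the contrast with whole-page transfer that the abstract draws against DVSM.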
Shared memory provides an attractive and intuitive programming model that makes good use of programm...
As microprocessors become faster and demand more bandwidth, the already limited scalability of a sha...
We argue that OS-provided data coherence on non-cache-coherent NUMA multiprocessors (machines with a...
Multiprocessors with shared memory are considered more general and easier to program than message-pa...
Traditionally, cache coherence in multiprocessors has been maintained in hardware. However, the cost...
Scalable shared memory multiprocessors traditionally use either a cache coherent nonuniform memory a...
Cache-only memory architecture (COMA) machines treat their entire memory as cache, thereby allowing ...
this paper; however, the Simple COMA design removes a significant amount of complexity from the hardw...
Large-scale multiprocessors suffer from long latencies for remote accesses. Caching is by far the mo...