Nonuniformity is a common characteristic of contemporary computer systems, mainly because of physical distances in computer designs. In large multiprocessors, access to shared memory is often nonuniform: in some nonuniform memory access (NUMA) architectures, access time may vary by as much as a factor of ten, depending on whether the memory is close to the requesting processor. Much research has been devoted to optimizing such systems. This thesis identifies another important property of computer designs: nonuniform communication architecture (NUCA). High-end hardware-coherent machines built from a few large nodes or from chip multiprocessors are typical NUCA systems that have a lower penalty for reading recently written data from a neighbor's cache ...
Communication and synchronization stand as the dual bottlenecks in the performance of parallel syste...
Shared memory multiprocessors make it practical to convert sequential programs to parallel ones in...
Multiprocessors with shared memory are considered more general and easier to program than message-pa...
Shared memory provides an attractive and intuitive programming model that makes good use of programm...
Shared memory is widely regarded as a more intuitive model than message passing for the development ...
We argue that OS-provided data coherence on non-cache-coherent NUMA multiprocessors (machines with a...
Thesis (Ph. D.)--University of Washington, 1997. Two recent trends are affecting the design of medium-...
Due to their excellent price-performance ratio, clusters built from commodity nodes have become broa...
This dissertation explores techniques for reducing the costs of inter-processor communication i...
High-end embedded systems, like their general-purpose counterparts, are turning to many-core cluster...
Distributed shared-memory systems provide scalable performance and a convenient model for parallel p...
If the trend of integrating more and more cores to a single die continues, general-purpose processor...
Abstract. A solution adopted in the past to design high-performance multiprocessor systems that wer...
Software distributed shared memory (DSM) platforms on networks of workstations tolerate large networ...
Abstract. Synchronization in parallel programs is a major performance bottleneck. Shared data is pro...