Distributed shared-memory systems provide scalable performance and a convenient model for parallel programming. However, their non-uniform memory latency often makes it difficult to develop efficient parallel applications. Future systems should reduce communication cost to achieve better programmability and performance. We have developed a methodology, and implemented a suite of tools, to guide the search for improved codes and systems. As a result of one such search, we recommend a remote data caching technique that significantly reduces communication cost.
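To make the remote data caching idea above concrete, the following is a minimal sketch of the general technique, not the specific mechanism recommended in this work: a small direct-mapped software cache keyed by line address, where remote_read() stands in for a high-latency fetch from another node's memory. All names and sizes (remote_read, rcache_get, the line and cache dimensions) are illustrative assumptions.

```c
/* Illustrative sketch only: a direct-mapped software cache for remote reads. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>

#define LINE_WORDS  8      /* words per cache line (assumed) */
#define CACHE_LINES 64     /* lines in the software cache (assumed) */

static uint64_t memory[1 << 16];   /* stands in for another node's memory */
static long remote_fetches = 0;    /* count of simulated remote accesses */

/* Stand-in for a remote memory access: in a real system this would be a
 * network transaction; here we copy a line from the local array and count it. */
static void remote_read(uint64_t addr, uint64_t *buf) {
    remote_fetches++;
    memcpy(buf, &memory[addr & ~(uint64_t)(LINE_WORDS - 1)],
           LINE_WORDS * sizeof(uint64_t));
}

typedef struct {
    uint64_t tag;
    int      valid;
    uint64_t data[LINE_WORDS];
} cache_line_t;

static cache_line_t rcache[CACHE_LINES];

/* Return the word at addr, going remote only on a cache miss. */
static uint64_t rcache_get(uint64_t addr) {
    uint64_t line_addr = addr / LINE_WORDS;
    cache_line_t *line = &rcache[line_addr % CACHE_LINES];
    if (!line->valid || line->tag != line_addr) {   /* miss: fetch whole line */
        remote_read(addr, line->data);
        line->tag = line_addr;
        line->valid = 1;
    }
    return line->data[addr % LINE_WORDS];
}

int main(void) {
    for (uint64_t i = 0; i < (1 << 16); i++) memory[i] = i;
    uint64_t sum = 0;
    for (int pass = 0; pass < 4; pass++)        /* reuse makes caching pay off */
        for (uint64_t a = 0; a < 512; a++)
            sum += rcache_get(a);
    printf("sum=%llu, remote fetches=%ld out of %d reads\n",
           (unsigned long long)sum, remote_fetches, 4 * 512);
    return 0;
}
```

Because the working set (64 lines) fits the software cache, repeated reads are served locally and only the first pass incurs remote fetches; how closely such a sketch matches the recommended technique depends on details the abstract does not give.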
Distributed memory parallel architectures support a memory model where some memory accesses are loca...
Estimating communication cost involved in executing a program on distributed memory machines is impo...
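One common way to reason about such costs, shown below as a sketch and not necessarily the estimation method developed in the cited work, is a linear latency-bandwidth ("alpha-beta") model: sending n bytes costs a fixed per-message startup time plus a per-byte transfer time. Even this simple model explains, for example, why aggregating many small messages into one large message can pay off. The parameter values are made up for illustration.

```c
/* Illustrative alpha-beta communication cost model (assumed parameters). */
#include <stdio.h>

/* Time to send nbytes: per-message latency plus per-byte transfer cost. */
static double msg_cost(double alpha_us, double beta_us_per_byte, double nbytes) {
    return alpha_us + beta_us_per_byte * nbytes;
}

int main(void) {
    double alpha = 10.0;   /* assumed 10 us startup per message */
    double beta  = 0.01;   /* assumed 0.01 us per byte (~100 MB/s) */

    /* Many small messages vs one large message carrying the same data. */
    double many_small = 1000 * msg_cost(alpha, beta, 8);
    double one_large  = msg_cost(alpha, beta, 1000 * 8);

    printf("1000 x 8-byte messages: %.1f us\n", many_small);
    printf("1 x 8000-byte message:  %.1f us\n", one_large);
    return 0;
}
```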
Programming using message passing is considered difficult and therefore many researchers have propos...
This dissertation explores techniques for reducing the costs of inter-processor communication i...
The transition to multi-core architectures can be attributed mainly to fundamental limitations in cl...
Shared memory is widely regarded as a more intuitive model than message passing for the development ...
Since the invention of the transistor, clock frequency increase was the primary method of improving ...
Thesis (Ph. D.)--University of Washington, 1997. Two recent trends are affecting the design of medium-...
Recent achievements in high-performance computing significantly narrow the performance gap between s...
Shared memory systems generally support consumer-initiated communication; when a process needs data,...
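The contrast being drawn is between pull-style (consumer-initiated) communication, where the process that needs a value requests it and waits for the reply, and push-style (producer-initiated) communication, where the producer forwards the value toward its consumer as soon as it is ready. The sketch below only illustrates the critical-path arithmetic under an assumed one-way message latency; it is not taken from the cited work.

```c
/* Illustrative only: consumer stall and message count under pull vs push
 * communication, with an assumed one-way network latency. */
#include <stdio.h>

#define ONE_WAY_US 5.0   /* assumed one-way message latency in microseconds */
#define N_ITEMS    100   /* number of data items the consumer needs */

int main(void) {
    /* Consumer-initiated (pull): each needed item puts a request plus a
     * reply on the consumer's critical path, i.e. a full round trip. */
    double pull_stall = N_ITEMS * 2.0 * ONE_WAY_US;
    int    pull_msgs  = 2 * N_ITEMS;

    /* Producer-initiated (push): the producer sends each item when it is
     * produced; if it arrives before the consumer asks for it, the consumer
     * does not stall at all, and only one message per item is sent. */
    double push_stall_worst = N_ITEMS * ONE_WAY_US;   /* item still in flight */
    double push_stall_best  = 0.0;                    /* item already arrived */
    int    push_msgs        = N_ITEMS;

    printf("pull: %d messages, %.0f us of consumer stall\n",
           pull_msgs, pull_stall);
    printf("push: %d messages, %.0f-%.0f us of consumer stall\n",
           push_msgs, push_stall_best, push_stall_worst);
    return 0;
}
```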