ABSTRACT The goal of this project was to improve the performance of large scientific and engineering applications through collaborative hardware and software mechanisms to manage the memory hierarchy of non-uniform memory access time (NUMA) shared-memory machines, as well as their component individual processors. In spite of the programming advantages of shared-memory platforms, obtaining good performance for large scientific and engineering applications on such machines can be challenging. Because communication between processors is managed implicitly by the hardware, rather than expressed by the programmer, application performance may suffer from unintended communication – communication that the programmer did not consider when developing...
As the number of NUMA system's cache coherency protocols based on the IEEE Std. 1596-1992, Standa...
Cache coherence is one of the main challenges to tackle when designing a shared-memory multiprocesso...
As multicore systems become widespread, both software and hardware face a major challenge in efficie...
Recent distributed shared memory (DSM) systems and proposed shared-memory machines have implemented ...
Abstract—Scalable distributed shared-memory architectures rely on coherence controllers on each proc...
The goal of this work is to explore architectural mechanisms for supporting explicit communication...
Recent distributed shared memory (DSM) systems and proposed shared-memory machines have implemented ...
We are currently designing Sparks, a protocol construction library that we hope will allow us to im...
A key goal of the Stanford FLASH project is to explore the integration of multiple communication pro...
The increasing number of cores in manycore architectures causes important power and scalability prob...
Commercial workload and technology trends are pushing existing shared-memory multiprocessor coherenc...
We argue that OS-provided data coherence on non-cache-coherent NUMA multiprocessors (machines with a...
Several multiprocessors have been proposed that offer programmable implementations of scalable cache...
Nonuniformity is a common characteristic of contemporary computer systems, mainly because of physica...
multiprocessor efficiently integrates support for cache-coherent shared memory and high-performance ...