Abstract. Parallel graph reduction is a model for parallel program execution in which shared memory is used under a strict access regime with single assignment and blocking reads. We outline the design of an efficient and accurate multiprocessor simulation scheme and the results of a simulation study of the performance of a suite of benchmark programs operating under a cache coherency protocol that is representative of protocols used in commercial shared-memory machines and in more scalable distributed shared-memory systems. We analyse the influence of cache line size on performance and expose the relative contributions of spatial, temporal and processor locality and false sharing to overall performance.
Emerging multiprocessor architectures such as chip multiprocessors, embedded architectures, and mas...
Coherence induced cache misses are an important aspect limiting the scalability of shared memory par...
Large-scale graph problems are becoming increasingly important in science and engineering. The irreg...
Abstract. Parallel functional programs based on the graph reduction execution model display consider...
Parallel graph reduction is a conceptually simple model for the concurrent evaluation of lazy functi...
Parallel graph reduction is a simple model for parallel program execution which uses the shared-memo...
Parallel applications exhibit a wide variety of memory reference patterns. Designing a memory archit...
Cache coherence is one of the main challenges to tackle when designing a shared-memory multiprocesso...
The abstraction of a cache is useful to hide the vast difference in speed of computer processors and...
A wide variety of computer architectures have been proposed to exploit parallelism at different gran...
Abstract. On many-core processors that do not provide hardware cache coherence, using shared memory ...
Data used by parallel programs can be divided into classes, based on how threads access it. For di...
During the last few years many different memory consistency protocols have been proposed. These rang...
We present an analytical model of a cache coherent shared-memory multiprocessor and compare the resu...
In this research we built a SystemC Level-1 data cache system in a distributed shared memory archite...