Abstract. Parallel functional programs based on the graph reduction execution model display considerable locality of reference, favouring the use of large cache lines in the implementation of the shared heap on a shared-memory multiprocessor. They also display a very high rate of synchronisation, making conventional weakly-consistent coherency protocols ineffective at avoiding unnecessary contention for write access to cache lines due to false sharing. We present the design of a specially adapted cache coherency protocol and show results of simulation experiments which demonstrate that the protocol allows spatial locality to be exploited to at least the level of a conventional invalidation protocol, but without the unnecessary serialisation...
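As a rough illustration of the false sharing mentioned in the abstract, the sketch below counts invalidations under a deliberately simplified write-invalidate scheme. It is not the protocol the paper proposes: the function name `count_invalidations`, the trace format, and the two-processor workload are all assumptions made for this example only.

```python
def count_invalidations(trace, line_size):
    """Count invalidations under a simplified write-invalidate scheme.

    trace     -- list of (processor, op, word_address), op is 'r' or 'w'
                 (hypothetical trace format, chosen for this sketch)
    line_size -- number of words per cache line
    """
    holders = {}          # cache line number -> set of processors with a valid copy
    invalidations = 0
    for proc, op, addr in trace:
        line = addr // line_size
        copies = holders.setdefault(line, set())
        if op == 'w':
            # A write needs exclusive access, so every other copy is invalidated.
            invalidations += len(copies - {proc})
            copies.clear()
        # After the access, the accessing processor holds a valid copy.
        copies.add(proc)
    return invalidations


# Two processors alternately write to *different* words (addresses 0 and 1).
trace = [(p, 'w', p) for _ in range(4) for p in (0, 1)]

print(count_invalidations(trace, line_size=1))  # each word has its own line
print(count_invalidations(trace, line_size=2))  # both words share one line
```

With one word per line the two processors never interfere. With two words per line every write after the first invalidates the other processor's copy, even though no word is actually shared. This is the contention that larger cache lines introduce.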
The abstraction of a cache is useful to hide the vast difference in speed of computer processors and...
In this paper, we present compiler algorithms for detecting references to stale data in shared-memory...
The cache coherence maintenance problem has been the major obstacle in using private cache memory to...
Abstract. Parallel graph reduction is a model for parallel program execution in which shared-memory...
Parallel graph reduction is a conceptually simple model for the concurrent evaluation of lazy functi...
Parallel applications exhibit a wide variety of memory reference patterns. Designing a memory archit...
Thesis (Ph. D.)--University of Washington, 1987. Shared-memory multiprocessors offer increased computa...
In this paper we present simulation algorithms that characterize the main sources of communication g...
Data used by parallel programs can be divided into classes, based on how threads access it. For di...
Coherence-induced cache misses are an important aspect limiting the scalability of shared memory par...
During the last few years many different memory consistency protocols have been proposed. These rang...
A method of reducing false sharing in a shared memory system by enabling two caches to m...
Abstract. On many-core processors that do not provide hardware cache coherence, using shared memory ...
To reduce overhead of cache coherence enforcement in shared-bus multiprocessors, we propose a selfin...
Cache coherence is one of the main challenges to tackle when designing a shared-memory multiprocesso...