Coherence-induced cache misses are an important factor limiting the scalability of shared-memory parallel programs. Many coherence misses are avoidable, namely those caused by false sharing, which occurs when different threads write to different memory addresses contained within the same cache block, causing unnecessary invalidations. Our work leverages approximate computing and the similarity among store values in multi-threaded, error-tolerant applications. We introduce a novel cache coherence protocol for approximate computing that implements an approximate store instruction and coherence states to allow some incoherence within approximatable shared data, mitigating both coherence misses and coherence traffic withi...
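The definition of false sharing given above can be made concrete with a small sketch. It is only an illustration of the definition, not the protocol proposed in the abstract: the 64-byte block size, the `(thread_id, address)` trace format, and the helper names `block_index` and `falsely_shared_blocks` are all assumptions introduced here.

```python
from collections import defaultdict

# Assumed cache block size in bytes; the abstract does not state one.
CACHE_BLOCK_BYTES = 64


def block_index(address: int, block_bytes: int = CACHE_BLOCK_BYTES) -> int:
    """Map a byte address to the index of the cache block containing it."""
    return address // block_bytes


def falsely_shared_blocks(writes, block_bytes: int = CACHE_BLOCK_BYTES):
    """Return the set of block indices that exhibit false sharing.

    `writes` is a hypothetical trace of (thread_id, address) pairs.
    A block is reported when at least two threads write into it but
    no single address in it is written by more than one thread.
    """
    writers_by_block = defaultdict(set)
    writers_by_addr = defaultdict(set)
    for thread_id, address in writes:
        writers_by_block[block_index(address, block_bytes)].add(thread_id)
        writers_by_addr[address].add(thread_id)

    result = set()
    for block, threads in writers_by_block.items():
        if len(threads) < 2:
            continue  # only one writer: no sharing of any kind
        truly_shared = any(
            len(addr_threads) > 1
            for addr, addr_threads in writers_by_addr.items()
            if block_index(addr, block_bytes) == block
        )
        if not truly_shared:
            result.add(block)
    return result


if __name__ == "__main__":
    trace = [(0, 0x1000), (1, 0x1008), (0, 0x2000), (1, 0x2040)]
    print(falsely_shared_blocks(trace))
```

In this trace, threads 0 and 1 write to different addresses inside the block that starts at `0x1000`, so that block is reported; the writes at `0x2000` and `0x2040` land in separate blocks and are not.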
During the last few years many different memory consistency protocols have been proposed. These rang...
Abstract. Parallel functional programs based on the graph reduction execution model display consider...
In large scale machines, thousands of processor cycles, in other words, missed opportunities to issu...
This thesis presents a new cache coherence protocol for shared bus multicache systems, and addresses...
Parallel applications exhibit a wide variety of memory reference patterns. Designing a memory archit...
Directory-based cache coherence protocol is accepted as the common technique in large scale shared m...
Data used by parallel programs can be divided into classes, based on how threads access it. For di...
The abstraction of a cache is useful to hide the vast difference in speed of computer processors and...
Cache coherence is one of the main challenges to tackle when designing a shared-memory multiprocesso...
Providing a consistent view of the shared memory based on precise and well-defined semantics—memory ...
Abstract. Parallel graph reduction is a model for parallel program execution in which shared-memory...
Emerging multiprocessor architectures such as chip multiprocessors, embedded architectures, and mas...
Virtual memory based cache coherence is a mechanism that relies only on hardware that already exi...
Multicore computing has presented many challenges for system designers, one of which is data consis...