Cache coherence protocols limit the scalability of multicore and manycore architectures and account for a significant share of the power consumed on the chip. A good way to alleviate these problems is to introduce a local memory alongside the cache hierarchy, forming a hybrid memory system. Local memories are more power-efficient than caches and do not generate coherence traffic, but they suffer from poor programmability. When memory access patterns are unpredictable, compilers fail to generate code because the two storages can become incoherent. This paper proposes a coherence protocol for hybrid memory systems that allows the compiler to generate code even in the presence of memory aliasing problems. Coher...
In a multiprocessor system on chip private caches introduce the cache coherence problem; because pro...
equipped with shared-memory, caches have significant impact on performance and energy consumption. I...
One of the key challenges in chip multi-processing is to provide a programming...
The increasing number of cores in manycore architectures causes important power and scalability prob...
Both hardware-controlled and compiler-directed mechanisms have been proposed for maintaining cache c...
This work describes a cache architecture and memory model for 1000+ core microprocessors. Our appro...
Weak memory consistency models can maximize system performance by enabling hardware and compiler opt...
Reducing memory latency is critical to the performance of large-scale parallel systems. Due to the t...
In programming high performance applications, shared address-space platforms are preferable for fine...
In large scale machines, thousands of processor cycles, in other words, missed opportunities to issu...