Ease of programming is one of the main impediments to the broad acceptance of multi-core systems with no hardware support for transparent data transfer between local and global memories. A software cache is a robust way to give the user a transparent view of the memory architecture, but this software approach can suffer from poor performance. In this paper, we propose a hierarchical, hybrid software-cache architecture that classifies memory accesses at compile time into two classes: high-locality and irregular. Our approach then steers each memory reference toward one of two cache structures, each optimized for its respective access pattern. The specific cache structures are optimized to enable high-level compiler optimization...
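To make the steering idea concrete, the following is a minimal sketch, not the architecture proposed in the paper. It assumes two read-only, direct-mapped structures that differ only in line size and line count, and it models the compile-time classification as a boolean the caller passes in. All names (`LineCache`, `HybridCache`, `high_locality`) and all sizes are hypothetical choices for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LineCache:
    """Read-only, direct-mapped software cache over a flat 'global memory' list."""
    memory: list
    line_size: int
    num_lines: int
    tags: dict = field(default_factory=dict)  # slot -> line number held there
    data: dict = field(default_factory=dict)  # slot -> local copy of that line
    hits: int = 0
    misses: int = 0

    def load(self, addr: int):
        line = addr // self.line_size
        slot = line % self.num_lines
        if self.tags.get(slot) == line:
            self.hits += 1
        else:
            # Miss: model the transfer of one line from global to local memory.
            self.misses += 1
            start = line * self.line_size
            self.data[slot] = self.memory[start:start + self.line_size]
            self.tags[slot] = line
        return self.data[slot][addr % self.line_size]


class HybridCache:
    """Steers each access to one of two structures (sizes are assumptions)."""

    def __init__(self, memory):
        # Long lines amortize the transfer cost over sequential accesses.
        self.high_locality = LineCache(memory, line_size=16, num_lines=4)
        # Short lines avoid fetching data that scattered accesses never use.
        self.irregular = LineCache(memory, line_size=2, num_lines=32)

    def load(self, addr: int, high_locality: bool):
        # In the paper the class is decided at compile time; here it is a flag.
        cache = self.high_locality if high_locality else self.irregular
        return cache.load(addr)


memory = list(range(256))
cache = HybridCache(memory)
streamed = [cache.load(i, high_locality=True) for i in range(64)]
scattered = [cache.load(a, high_locality=False) for a in (200, 7, 131, 200, 7)]
print(cache.high_locality.misses, cache.irregular.misses)
```

In this toy setup the sequential stream misses once per 16-element line, while the scattered accesses miss only on the first touch of each address. The sketch leaves out writes, write-back, and any pinning or hierarchy between the two structures.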
In this paper we propose an instruction to accelerate software caches. While DMAs are very efficient...
Many applications are memory intensive and thus are bounded by memory latency and bandwidth. While i...
... embedded devices to have the benefits of a memory hierarchy without the hardware costs. A softwa...
This paper describes the implementation of a runtime library for asynchronous communication in the C...
The performance of a computing system heavily depends on the memory hierarchy. Fast but expensive ca...
The growing computing demands of emerging application domains such as Recognition/Mining/Synthesis (...
Cache coherence protocols limit the scalability of multicore and manycore architectures and are resp...
Despite the fact that the most viable L1 memories in processors are caches, on-chip local memories ...
Modern processors apply sophisticated techniques, such as deep cache hierarchies and hardware prefet...
The high performance delivered by modern computer system keeps scaling with an increasing number of p...
Shared memory provides an attractive and intuitive programming model that makes good use of programm...
The widening gap between processor and memory speeds renders data locality optimization a very impor...
In order to mitigate the impact of the constantly widening gap between processor speed and main memo...