In this paper we propose an instruction to accelerate software caches. While DMAs are very efficient for predictable data sets that can be fetched before they are needed, they introduce a large latency overhead for computations with unpredictable access behavior. Software caches are advantageous when the data set is not predictable but exhibits locality. However, software caches also incur a large overhead. Because the main overhead is in the access function, we propose an instruction that replaces the look-up function of the software cache. This instruction is evaluated using the Multidimensional Software Cache and two multimedia kernels, GLCM and H.264 Motion Compensation. The results show that the proposed instruction accelerates the sof...
Ease of programming is one of the main impediments to the broad acceptance of multi-core systems wi...
Caching becomes very important in high-load computer applications. In a web application, a cache can impr...
The memory system remains a major performance bottleneck in modern and future architectures. In this...
We explore the use of compiler optimizations, which optimize the layout of instructions in memory. T...
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer...
Modern processors apply sophisticated techniques, such as deep cache hierarchies and hardware prefet...
In this paper we address the important problem of instruction fetch for future wide-issue superscal...
Modern many-core programmable accelerators are often composed of several computing units grouped in ...
The performance of a computing system heavily depends on the memory hierarchy. Fast but expensive ca...
Truly incremental development is a holy grail of the verification-intensive software industry. All facto...
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer...
This paper describes a General-Purpose Software cache (GPS cache) which can improve the performance ...
This research aims to explore possible solutions for improving performance in multimedia processo...
Contention for shared cache resources has been recognized as a major bottleneck for multicores—espec...