Abstract—In most embedded and general-purpose architectures, stack data and non-stack data are cached together, meaning that writing to or loading from the stack may evict non-stack data from the data cache. Stack manipulation has a different memory access pattern than non-stack data, showing higher temporal and spatial locality. We propose caching stack and non-stack data separately and develop four different stack caches that allow this separation without requiring compiler support. These are the simple, window, prefilling-with-tag, and prefilling-without-tag stack caches. The performance of the stack cache architectures was evaluated using the SimpleScalar toolset where the window and prefilling stack cache without tag resulted in an ...
Abstract—This paper analyzes the trade-offs in architecting stacked DRAM either as part of main memo...
Hardware trends have produced an increasing disparity between processor speeds and memory access tim...
Grantor: University of Toronto. The latency of accessing instructions and data from the memo...
Treating data based on its location in memory has received much attention in recent years due to its...
The gap between processor and memory speed appears as a serious bottleneck in improving the performa...
The design of tailored hardware has proven a successful strategy to reduce the...
The growing complexity of modern computer architectures increasingly compli...
The widening gap between the processor clock speed and the memory latency puts added pressure on ...
The gap between CPU and main memory speeds has long been a performance bottleneck. As we move toward...
Current split data caches classify data as having either spatial locality or temporal locality. The...
During the last two decades, CPU performance has improved much faster than that of memo...
Abstract—Real-time systems need time-predictable architectures to support static worst-case executi...
We present a technique to increase data cache utilization of pointer-based programs. These caches ar...
Abstract—As more cores (processing elements) are included in a single chip, it is likely that the ...
Contention for shared cache resources has been recognized as a major bottleneck for multicores—espec...