Abstract. Caches are widely used to bridge the speed gap between processors and memories. However, conventional data caches do not fully exploit the spatial locality of the sequential data accesses found in many popular applications. To address this problem, the Split Sequential Data Cache (SSDC) is proposed, in which a sequential access detector predicts whether each data access is sequential and directs it to the appropriate sub-cache. Experiments show that the SSDC outperforms the conventional data cache and other schemes. It reduces the miss rate of applications with intensive sequential data accesses at the cost of only a small increase in bandwidth requirement. Meanwhile, the experimental results on SPEC2000Int show that SSDC does not h...
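The routing idea in the abstract can be illustrated with a toy model. The sketch below is not the SSDC design itself: the abstract does not say how the detector works, so the per-instruction (PC-indexed) table, the run-length `threshold`, the 32-byte `BLOCK_SIZE`, the fully associative LRU sub-caches, and all class names are assumptions made purely for illustration.

```python
from collections import OrderedDict

BLOCK_SIZE = 32  # bytes per cache block (assumed value)


class LRUCache:
    """Tiny fully associative LRU cache that stores block numbers only."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.blocks = OrderedDict()

    def lookup(self, block):
        # On a hit, mark the block as most recently used.
        if block in self.blocks:
            self.blocks.move_to_end(block)
            return True
        return False

    def fill(self, block):
        self.blocks[block] = True
        if len(self.blocks) > self.num_blocks:
            self.blocks.popitem(last=False)  # evict least recently used


class SequentialDetector:
    """Hypothetical detector: per-PC last address plus a run counter."""

    def __init__(self, threshold=2):
        self.threshold = threshold
        self.table = {}  # pc -> (last_addr, run_length)

    def is_sequential(self, pc, addr, size):
        last_addr, run = self.table.get(pc, (None, 0))
        # The run grows only when this access directly follows the last one.
        if last_addr is not None and addr == last_addr + size:
            run += 1
        else:
            run = 0
        self.table[pc] = (addr, run)
        return run >= self.threshold


class SplitCache:
    """Two sub-caches; misses are allocated where the detector points."""

    def __init__(self, seq_blocks, general_blocks):
        self.detector = SequentialDetector()
        self.seq = LRUCache(seq_blocks)
        self.general = LRUCache(general_blocks)
        self.hits = 0
        self.misses = 0

    def access(self, pc, addr, size=4):
        block = addr // BLOCK_SIZE
        sequential = self.detector.is_sequential(pc, addr, size)
        # Both sub-caches are probed; only allocation depends on the prediction.
        if self.seq.lookup(block) or self.general.lookup(block):
            self.hits += 1
            return True
        self.misses += 1
        (self.seq if sequential else self.general).fill(block)
        return False
```

In this toy model, a long streaming loop issued from one instruction is steered into the small sequential sub-cache after its first few accesses, so a hot working set already resident in the general sub-cache is not evicted by the stream.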
Abstract. Future embedded systems are expected to use chip-multiprocessors to provide the execution ...
The performance of superscalar processors is more sensitive to the memory system delay than their si...
Multiprocessors with shared memory are considered more general and easier to program than message-pa...
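Cache technology can effectively bridge the speed gap between processors and memory; however, as the volume of data to be processed grows, sequential data accesses become increasingly common, and current caches are not very... when faced with this kind of sequential data, which has little temporal locality and causes heavy cache pollution...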
During the last two decades, CPU performance has improved much faster than that of memo...
The goal of cache design is to exploit data localities; however, the means to this end vary widely a...
In recent years, CPU performance has become energy constrained. If performance is to continue increa...
This paper shows that even very small reconfigurable data caches, when split to serve data streams ...
Treating data based on its location in memory has received much attention in recent years due to its...
Current split data caches classify data as having either spatial locality or temporal locality. The...
The speed of processors increases much faster than the memory access time. This makes memory accesse...
In this paper we show that partitioning the data cache into array and scalar caches can improve cache ac...
On-chip caches to reduce average memory access latency are commonplace in today's commercial micr...
The widening gap between the processor clock speed and the memory latency puts an added pressure on ...