Abstract – We present an architecture for improving the data cache miss rate. Our enhancement seeks to capture greater temporal locality than a standard cache by using the available area more efficiently. Frequently used data should remain in the cache, while infrequently used data should not be admitted in the first place: policing entry into the cache leaves more room for useful data. The Selective Fill Data Cache prevents rarely used data from entering the cache. The Cache Fill Policy keeps a record of data blocks that exhibit little temporal locality; such blocks bypass the L1 cache on their way to the CPU. On this bypass path, we cache these values in a bypass buffer in order to decrease the penalty for misprediction...
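The fill-policy idea in the abstract can be illustrated with a toy model. The sketch below is not the paper's implementation; it assumes a simple per-block reuse counter as the locality predictor, an LRU L1, and a small FIFO bypass buffer, with all names (`SelectiveFillCache`, `FILL_THRESHOLD`, sizes) chosen for illustration:

```python
from collections import OrderedDict

class SelectiveFillCache:
    """Toy model of a selective-fill L1 cache with a bypass buffer.

    Blocks whose reuse counter is below FILL_THRESHOLD are predicted to
    have little temporal locality; on a miss they bypass the L1 and are
    kept in a small FIFO bypass buffer instead. A bypass-buffer hit is
    the mechanism that softens the penalty of mispredicting a block as
    low-locality. All parameters here are illustrative assumptions.
    """

    FILL_THRESHOLD = 2  # accesses before a block is deemed cache-worthy

    def __init__(self, l1_size=4, bypass_size=2):
        self.l1 = OrderedDict()       # LRU-ordered L1 cache
        self.bypass = OrderedDict()   # small FIFO bypass buffer
        self.reuse = {}               # fill policy's per-block reuse table
        self.l1_size = l1_size
        self.bypass_size = bypass_size

    def _fill_l1(self, block):
        self.l1[block] = True
        if len(self.l1) > self.l1_size:
            self.l1.popitem(last=False)   # evict the LRU block

    def access(self, block):
        self.reuse[block] = self.reuse.get(block, 0) + 1
        if block in self.l1:              # ordinary L1 hit
            self.l1.move_to_end(block)
            return "l1_hit"
        if block in self.bypass:          # hit on the bypass path
            if self.reuse[block] >= self.FILL_THRESHOLD:
                # The block showed reuse after all: promote it to L1.
                del self.bypass[block]
                self._fill_l1(block)
            return "bypass_hit"
        # Miss everywhere: consult the fill policy.
        if self.reuse[block] >= self.FILL_THRESHOLD:
            self._fill_l1(block)
            return "miss_filled"
        self.bypass[block] = True
        if len(self.bypass) > self.bypass_size:
            self.bypass.popitem(last=False)   # FIFO eviction
        return "miss_bypassed"
```

In this model, a first-touch block lands in the bypass buffer; if it is re-referenced soon, the bypass hit both services the access cheaply and promotes the block into the L1, while streaming blocks never pollute the cache.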
FPGAs rely on massive datapath parallelism to accelerate applications even with a low clock frequenc...
As processors continue to deliver higher levels of performance and as memory latency tolerance techn...
This paper proposes an optimization by an alternative approach to memory mapping. Caches with low se...
Modern cache designs exploit spatial locality by fetching large blocks of data called cache lines on...
The performance of cache memories relies on the locality exhibited by programs. Traditionally this l...
In traditional cache-based computers, all memory references are made through cache. However, a signi...
Classic cache replacement policies assume that miss costs are uniform. However, the correlation betw...
Distinguishing transient blocks from frequently used blocks enables servicing references to transien...
Improving cache performance requires understanding cache behavior. However, measuring cache performa...
The speed of processors increases much faster than the memory access time. This makes memory accesse...
As the performance gap between the processor cores and the memory subsystem increases, designers are...
As buffer cache is used to overcome the speed gap between processor and storage devices, performance...
Caches mitigate the long memory latency that limits the performance of modern processors. However, c...
While data filter caches (DFCs) have been shown to be effective at reducing data access energy, they...