Abstract. An address decoder is a small hardware unit that uses an address to index memory units, including cache memories, and to place data into them. Current CPU cache designs use a single decoder unit to place data into the cache. In this paper we describe a technique that reduces contention in CPU caches by using multiple address decoders. We argue that multiple decoding techniques achieve better data placement and better utilization of the CPU cache. We present an overview of an instrumentation tool developed to collect fine-grained data traces, and a technique for virtually splitting caches using separate address decoders. Our results demonstrate the feasibility and the impact of virtual cache...
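To make the idea of splitting a cache between separate address decoders concrete, the following is a minimal sketch and not the paper's implementation. It simulates a tiny direct-mapped cache over a trace of accesses. The line size, the set counts, the XOR-folding decoder, and the routing of accesses by a region label are all assumptions chosen for illustration.

```python
LINE_BITS = 6  # assumed 64-byte cache lines


def modulo_decoder(addr, num_sets):
    """Conventional decoder: low-order line-address bits select the set."""
    return (addr >> LINE_BITS) % num_sets


def xor_decoder(addr, num_sets):
    """Hypothetical alternative decoder: XOR-fold upper bits into the index."""
    line = addr >> LINE_BITS
    return (line ^ (line // num_sets)) % num_sets


def count_misses(trace, route, partitions):
    """Count misses in a direct-mapped cache made of one or more partitions.

    trace:      list of (region, addr) pairs
    route:      dict mapping a region label to a partition name
    partitions: dict mapping a partition name to (decoder, num_sets)
    """
    stored = {name: [None] * sets for name, (_, sets) in partitions.items()}
    misses = 0
    for region, addr in trace:
        name = route[region]
        decoder, sets = partitions[name]
        index = decoder(addr, sets)
        line = addr >> LINE_BITS
        if stored[name][index] != line:
            misses += 1
            stored[name][index] = line  # fill the set with the new line
    return misses


# Two regions whose lines collide on set 0 of an 8-set cache.
trace = [("a", 0x0000), ("b", 0x0200)] * 4

# Single decoder serving the whole cache.
unified = count_misses(
    trace,
    route={"a": "all", "b": "all"},
    partitions={"all": (modulo_decoder, 8)},
)

# Same capacity, virtually split into two 4-set halves with separate decoders.
split = count_misses(
    trace,
    route={"a": "pa", "b": "pb"},
    partitions={"pa": (modulo_decoder, 4), "pb": (xor_decoder, 4)},
)

print(unified, split)
```

On this hand-built trace the two regions evict each other on every access in the unified cache, while each keeps its line resident once the cache is split. The trace is constructed to show contention, so the numbers say nothing about real workloads.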
On-chip caches to reduce average memory access latency are commonplace in today's commercial micr...
We present a technique to increase data cache utilization of pointer-based programs. These caches ar...
This paper presents a technique for minimizing chip-area cost of implementing an on-chip cache memor...
This dissertation presents a hardware accelerator that is able to accelerate large (including non-pa...
The mapping of the physical address space to actual physical locations in DRAM is a complex multista...
This dissertation explores techniques for reducing the costs of inter-processor communication i...
The gap between CPU and main memory speeds has long been a performance bottleneck. As we move toward...
Abstract. Cache attacks, which exploit differences in timing to perform covert or side channels, are...
The widening gap between processor and memory speeds renders data locality optimization a very impor...
For many programs, especially integer codes, untolerated load instruction latencies account for a si...
Address correlation is a technique that links the addresses that reference the same data values. Usi...
Designers typically add design margins to memories to compensate for their aging. As the aging impac...
Trace-driven simulation is an important aid in performance analysis of computer systems. Capturing a...
The gap between processor speed and memory latency has led to the use of caches in the memory system...