Many contemporary applications feature multi-megabyte instruction footprints that overwhelm the capacity of branch target buffers (BTBs) and instruction caches (L1-I), causing frequent front-end stalls that inevitably hurt performance. The BTB is crucial for performance because it enables the front-end to accurately resolve the upcoming execution path and steer instruction fetch appropriately. It also enables highly effective fetch-directed instruction prefetching that can eliminate many L1-I misses. For these reasons, commercial processors allocate vast amounts of storage capacity to BTBs. This letter aims to reduce BTB storage requirements by optimizing the organization of BTB entries. Our key insight is that today's BTBs store the full t...
The effort to reduce address translation overheads has typically targeted data accesses since they c...
To maximize the performance of a wide-issue superscalar processor, the fetch mechanism must be capab...
Modern processors rely heavily on speculation to provide performance. Techniques such as branch pred...
Achieving high instruction issue rates depends on the ability to dynamically predict branches. We co...
Most newly announced microprocessors manipulate 64-bit virtual addresses and the width of physical a...
Accurate branch prediction is critical to performance; mispredicted branches mean that tens of cycl...
Processor architectures will increasingly rely on issuing multiple instructions to make full use of ...
The continually increasing speed of microprocessors stresses the need for ever faster instruction fe...
Most newly announced high-performance microprocessors support 64-bit virtual addresses and the wi...
The potential performance of superscalar processors can be exploited only when the processor is fed with...
As more and more query processing work can be done in main memory, memory access is becoming a signi...
Energy efficiency is a first-order design goal for nearly all classes of processors, but it is parti...
In this paper, we propose an alternative BTB design, called lazy BTB, to reduce the BTB ene...
We explore the use of compiler optimizations that optimize the layout of instructions in memory. T...
In the pursuit of instruction-level parallelism, significant demands are placed on a processor's ins...