The potential performance of superscalar processors can be exploited only when the processor is fed with sufficient instruction bandwidth. The front-end units, the Instruction Stream Buffer (ISB) and the fetcher, are the key elements for achieving this goal. Current ISBs cannot support instruction streaming beyond a basic block. In x86 processors, the split-line instruction problem worsens this situation. In this paper, we propose a basic-block-reassembling ISB. By cooperating with the proposed Line-Weighted Branch Target Buffer (LWBTB), the proposed ISB can predict upcoming branch information and reassemble the current cache line together with the other cache line containing instructions for the next basic block. Therefore, the fetcher could ...
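The reassembly idea in the abstract above can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not the paper's ISB or LWBTB design: instructions are treated as fixed-size slots (so the x86 split-line problem is not modelled), each cache line has at most one predicted-taken branch, and the names `BranchEntry` and `reassemble` are hypothetical.

```python
from typing import Dict, List, NamedTuple


class BranchEntry(NamedTuple):
    """Hypothetical branch-target record for one cache line (an assumption)."""
    branch_offset: int  # slot of the predicted-taken branch in the current line
    target_line: int    # index of the cache line holding the branch target
    target_offset: int  # slot of the target instruction within that line


def reassemble(cache: Dict[int, List[str]],
               btb: Dict[int, BranchEntry],
               line: int,
               start: int,
               width: int) -> List[str]:
    """Return up to `width` instructions starting at slot `start` of `line`.

    If the toy BTB predicts a taken branch at or after `start`, the
    instructions up to and including the branch are joined with the
    instructions at the branch target, which may live in another line.
    """
    current = cache[line]
    entry = btb.get(line)
    if entry is None or entry.branch_offset < start:
        # No usable prediction: stream sequentially from the current line only.
        return current[start:start + width]
    head = current[start:entry.branch_offset + 1]
    tail = cache[entry.target_line][entry.target_offset:]
    return (head + tail)[:width]


# Two four-slot cache lines; the branch in slot 2 of line 0 targets slot 1 of line 1.
cache = {0: ["a0", "a1", "br", "a3"], 1: ["b0", "b1", "b2", "b3"]}
btb = {0: BranchEntry(branch_offset=2, target_line=1, target_offset=1)}
merged = reassemble(cache, btb, line=0, start=0, width=4)
# merged == ["a0", "a1", "br", "b1"]
```

Without the prediction, the same request would return only instructions from line 0, so the slot after the branch would be filled with an instruction that is not on the predicted path.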
Future processors combining out-of-order execution with aggressive speculation techniques will need ...
Accurate branch prediction is critical to performance; mispredicted branches mean that tens of cycl...
The need to minimize power while maximizing performance has led to recent developments of powerful s...
The stream fetch engine is a high-performance fetch architecture based on the concept of an instruct...
Achieving high instruction issue rates depends on the ability to dynamically predict branches. We co...
The continually increasing speed of microprocessors stresses the need for ever faster instruction fe...
In the pursuit of instruction-level parallelism, significant demands are placed on a processor's ins...
To exploit larger amounts of instruction level parallelism, processors are being built with wider is...
This work presents several techniques for enlarging instruction streams. We define a stream as a sequenc...
Recently, several loop buffer designs have been proposed to reduce instruction fetch energy...
Many contemporary applications feature multi-megabyte instruction footprints that overwhelm the capa...
Recent superscalar processors issue four instructions per cycle. These processors are also powered ...
Fetch performance is a very important factor because it effectively limits the overall processor per...
To maximize the performance of a wide-issue superscalar processor, the fetch mechanism must be capab...
The design of higher performance processors has been following two major trends: increasing the pipe...