Fetch engine performance is a key topic in superscalar processors, since it limits the instruction-level parallelism that can be exploited by the execution core. In the search for high performance, the fetch engine has evolved toward more efficient designs, but its complexity has also increased. In this paper, we present the stream fetch engine, a novel architecture based on the execution of long streams of sequential instructions, taking maximum advantage of code layout optimizations. We describe our design in detail, showing that it achieves high fetch performance while requiring less complexity than other state-of-the-art fetch architectures.
In the pursuit of instruction-level parallelism, significant demands are placed on a processor's ins...
To maximize the performance of a wide-issue superscalar processor, the fetch mechanism must be capab...
In superscalar processors, capable of issuing and executing multiple instructions per cycle, fetch p...
Fetch performance is a very important factor because it effectively limits the overall processor per...
The design of higher performance processors has been following two major trends: increasing the pipe...
The stream fetch engine is a high-performance fetch architecture based on the concept of an instruct...
Fetch engine performance is seriously limited by the branch prediction table access latency. This fa...
The effective performance of wide-issue superscalar processors depends on many parameters, such as b...
Despite the extensive deployment of multi-core architectures in the past few years, the design and o...
Simultaneous multithreading (SMT) is an architectural technique that allows for the parallel executi...
This work presents several techniques for enlarging instruction streams. We define a stream as a sequenc...
We explore the use of compiler optimizations, which optimize the layout of instructions in memory. T...
A superscalar processor supporting speculative execution requires an instruction fetch mechanism th...