One of the main obstacles to exploiting the fine-grained parallelism that is available in general-purpose code is the frequency of branches that cause unpredictable changes in the control flow of a program at run time. Whenever a branch is taken, a performance penalty may be incurred as the processor waits for instructions to be fetched from the branch target stream. RISC processors introduced a delayed-branch mechanism that defines branch delay slots into which code can be scheduled. This strategy keeps the processor busy executing useful instructions while the change of control flow takes place. While the concept of delayed branches can be readily extended to VLIW architectures, it is less clear how it should be incorporated i...
Superscalar and superpipelining techniques increase the overlap between the instructions in a pipeli...
The presence of branch instructions in an instruction stream may adversely affect the performance of...
Modern superscalar processors use advanced features like dynamic scheduling and speculative executio...
While delayed branch mechanisms were popular with the designers of RISC processors, most superscalar...
If a high-performance superscalar processor is to realise its full potential, the compiler must re-o...
In this paper we show how to formally specify and simulate the high-level instruction timing propert...
It is increasingly accepted that superscalar processors can only achieve their full performance pote...
LaZy Superscalar is a processor architecture which delays the execution of fetched instructions unti...
Although instruction scheduling is an NP-complete problem (27), many techniques have been develope...
A mechanism to reduce the cost of branches in pipelined processors is described and evaluated. It is...
The foremost goal of superscalar processor design is to increase performance through the exploitatio...
High performance computer architectures increasingly use compile-time instruction scheduling to reor...
To characterize future performance limitations of superscalar processors, the delays of key pipeline...
As the issue width and depth of pipelining of high performance superscalar processors increase, the ...