Advances in IC technology increase integration density, allowing higher clock rates and providing more opportunities for microprocessor design. In this paper, we propose a new paradigm to exploit instruction-level parallelism (ILP) across multiple superscalar processors on a single chip by taking advantage of both VLIW-style static scheduling techniques and the dynamic scheduling of superscalar architectures. In the proposed paradigm, a compiler extracts ILP from a sequential program, and the resulting VLIW-like parallelized code is further parallelized by 2-way superscalar engines at run time. The superscalar processors are connected by an aggregate function network, which can enforce the necessary static timing constraints and provide appropriate inter-...
To achieve high performance, contemporary computer systems rely on two forms of parallelism: instruc...
Statically scheduled processors are known to enable low complexity hardware implementations that lea...
We introduce explicit multi-threading (XMT), a decentralized architecture that exploits fine-grained...
We present a technique for ameliorating the detrimental impact of the true data dependencies that ul...
CMOS technology scaling poses challenges in designing dynamically scheduled cores that can sustain b...
Extensive research has been done on extracting parallelism from single instruction stream processors....
A common approach to enhance the performance of processors is to increase the number of function uni...
Superscalar and VLIW processors can both execute multiple instructions each cycle. Each employs a di...
dataflow processors, superscalar processors, instruction scheduling, trace scheduling, software pipe...
Abstract. A compiler for VLIW and superscalar processors must expose sufficient instruction-level pa...
Recent high performance processors have depended on Instruction Level Parallelism (ILP) to achieve h...
A great deal of the current research into computer architecture is directed at Multiple Instruction ...
Advances in VLSI technology will enable chips with over a billion transistors within the next decade...
High-performance, general-purpose microprocessors serve as compute engines for computers ranging fro...