Limits on power consumption, complexity, and on-chip latency have focused computer architects on power-efficient designs that exploit parallelism. One approach divides programs into atomic blocks of operations that execute semi-independently, which efficiently creates a large window of potentially concurrent operations. This dissertation studies the intertwined roles of the compiler, architecture, and microarchitecture in achieving efficiency and high performance with a block-atomic architecture. For such an architecture to achieve high performance, the compiler must form blocks effectively. The compiler must create large blocks of instructions to amortize the per-block overhead, but control flow and content restrictions...
To exploit larger amounts of instruction level parallelism, processors are being built with wider is...
High performance superscalar microarchitectures exploit instruction-level parallelism (ILP) to impro...
Performance bounds represent the best achievable performance that can be delivered by target microar...
Technology trends such as growing wire delays, power consumption limits, and diminishing clock r...
Explicit Data Graph Execution (EDGE) architectures offer the possibility of high instruction-level p...
This paper makes two new observations that lead to a new heterogeneous core design. First, we observ...
Parallelizing compiler technology has improved in recent years. One area in which compilers have ma...
We deal with compiler support for parallelizing perfectly nested loops for coarse-grain distributed ...
Atomic blocks allow programmers to delimit sections of code as ‘atomic’, leaving the language’s impl...
The increasing density of VLSI circuits has motivated research into ways to utilize large area budge...
This work was also published as a Rice University thesis/dissertation: http://hdl.handle.net/1911/16...
The end of Dennard scaling also brought an end to frequency scaling as a means to improve performanc...
The trend in high-performance microprocessor design is toward increasing computational power on the ...