Hardware accelerators are an energy-efficient alternative to general-purpose processors for specific program regions. They have relied on the compiler to extract instruction-level parallelism, but may waste significant energy on memory disambiguation and on discovering memory-level parallelism (MLP). Currently, accelerators take one of three approaches: i) define the problem away and rely on massively parallel programming models [1, 48] to extract MLP; ii) reuse the out-of-order (OoO) processor [7, 28] and rely on power-hungry load-store queues (LSQs) for memory disambiguation; or iii) serialize: some accelerators [47] focus on program regions where MLP is not important and simply serialize memory operations. We present NACHOS, a compiler-assisted, energy-efficient...
This work was also published as a Rice University thesis/dissertation: http://hdl.handle.net/1911/16...
High-performance architectures rely upon powerful optimizing and parallelizing compilers to maximize...
The multicore era has initiated a move to ubiquitous parallelization of software. In the process, co...
Memory disambiguation mechanisms, coupled with load/store queues in out-of-ord...
Alias analysis, traditionally performed statically, is unsuited for a dynamic binary trans...
In this paper, an implementation of a demand-driven alias analysis [7] in Open64 is presented. In th...
The trend in high-performance microprocessor design is toward increasing computational power on the ...
One of the main performance bottlenecks of processors today is the discrepancy between processor and...
As we witness the breakdown of Dennard scaling, we can no longer get faster computers by shrinking t...
Memory bandwidth has become the performance bottleneck for memory intensive programs on modern proce...
The world needs special-purpose accelerators to meet future constraints on computation and power con...
Program redundancy analysis and optimization have been an important component in optimizing compiler...
Parallelising compilers try to automatically convert sequential programs into parallel programs to b...
There is a trend towards using accelerators to increase performance and energy efficiency of general...