Automatic parallelizing compilers have evolved greatly over the last decade. Tools such as Pluto, Par4All and PPCG are widely used to generate optimized OpenMP, CUDA and OpenCL code from serial input code. In the end, however, it is the programmer's responsibility to select the best target architecture for a particular application, depending on time or energy constraints. In this dissertation we describe a software-feature-centric approach to selecting, between two devices attached to a heterogeneous compute node, the architecture that will run a generated parallel code fastest. Recognizing the growing importance of energy-aware computing, we extend our work to select the most energy-efficient device to run ...
In the early 2000s, the superscalar CPU paradigm reached the point of diminishing returns mainly due...
Because of tight power and energy constraints, industry is progressively shifting toward heterogeneo...
As the demand increases for high performance and power efficiency in modern computer runtime systems...
The next generation of supercomputers will feature a diverse mix of accelerator devices. The increas...
The rising pressure to simultaneously improve performance and reduce power consumption is driving mo...
Cavazos, John. As the high-performance computing (HPC) community continues the push towards exascale ...
To help shrink the programmability-performance efficiency gap, we discuss that adaptive runtime syst...
Abstract—Big and complex applications need many resources and long computation time to execute seque...
Thanks to parallel processing, it is possible not only to reduce code runtime but also energy consum...
This paper studies the performance and energy consumption of several multi-core, multi-CPUs and many...
Energy consumption is one of the top challenges for achieving the next generation of supercomputing....
Nowadays, reducing energy consumption and improving the energy efficiency of computing systems becom...
Thanks to parallel processing, it is possible not only to reduce code runtime but also energy consum...
Energy consumption by computer systems has emerged as an important concern, both at the level of ind...
Energy consumption is a major concern with high performance multicore systems. In this paper, we exp...