Continuing advances in heterogeneous and parallel computing enable massive performance gains in domains such as AI and HPC. Such gains often involve using hardware accelerators, such as FPGAs and GPUs, to speed up specific workloads. However, to make effective use of emerging heterogeneous architectures, optimisation is typically done manually by highly-skilled developers with in-depth understanding of the target hardware. The process is tedious, error-prone, and must be repeated for each new application. This paper introduces Design-Flow Patterns, which capture modular, recurring application-agnostic elements involved in mapping and optimising application descriptions onto efficient CPU and GPU targets. Our approach is the first to...
This paper presents a new technique for introducing and tuning parallelism for heterogeneous shared-...
Design optimization relies heavily on time-consuming simulations, especially when using gradient-fre...
This dissertation focuses on efficient generation of custom processors from high-level language desc...
In today's increasingly heterogeneous compute landscape, there is high demand for design tools that ...
This paper provides a novel compilation approach that addresses the complexity of mapping high-level...
Graphics Processing Units (GPUs) are now commonplace in computing systems and are the most successf...
Funding: This work has been supported by the European Union Framework 7 grant IST-2011-288570 “ParaP...
Heterogeneous computer systems are ubiquitous in all areas of computing, from mobile to high-perfor...
Metaheuristics have been showing interesting results in solving hard optimization problems. However,...
Real-world optimization problems are often complex and NP-hard. The...
We propose a design methodology to facilitate rigorous development of complex applications targeting...
With the quickly evolving hardware landscape of high-performance computing (HPC) and its increasing ...
The increasing heterogeneity of computing systems enables higher performance and power efficiency. H...
The quality of compiler-optimized code for high-performance applications lags ...
Manual tuning of applications for heterogeneous parallel systems is tedious and complex. Optimizati...