It is often hard to predict the performance of statically generated code. Hardware availability, hardware specification, and problem size may change from one execution context to another. The main contribution of this work is an entirely automatic method that predicts the execution times of semantically equivalent versions of affine loop nests on GPUs, then runs the best-performing one on the GPU or CPU. To make accurate predictions, our framework relies on three consecutive stages: static code generation, offline profiling, and online prediction. The different versions are statically generated by PPCG, a source-to-source polyhedral compiler that generates CUDA code from static control loops written in C. The c...
The relentless demands for improvements in compute throughput and energy efficiency have driven...
Modern Graphics Processing Units (GPUs) offer significant performance speedup over conventional proce...
We propose a generalized method for adapting and optimizing algorithms for efficient execution on mo...
We contribute a method to jointly use CPU and GPU in order to execute a balanc...
Technological limitations faced by semiconductor manufacturers in the early 2000s restricted t...
GPUs have become popular due to their high computational power. Data scientists rely on GPUs to proc...
Graphics Processing Units (GPUs) have been widely adopted to accelerate the execution of HPC workload...
Recent advances in GPUs (graphics processing units) lead to massively parallel hardware that is eas...
A major shift in technology from maximizing single-core performance to integrating multiple cores ha...
The massive parallelism offered by Graphics Processing Units (GPUs) is now routinely exploi...
Heterogeneous processing using GPUs is here to stay and today spans mobile devices, laptops, and ...
Selected for presentation at the HiPEAC 2013 Conf. This paper addresses the com...