In this paper, we discuss techniques to transform sequential programs into CUDA programs optimized for texture/surface memory. We achieve this using PPCG, an automatic parallelizing compiler based on the polyhedral model. We implemented a static analysis in PPCG that validates the semantics of the texturized, transformed program. Depending on the results of the analysis, our algorithm chooses to use texture and/or surface memory and alters the Abstract Syntax Tree accordingly. We also modified the code-generation phase of PPCG to take care of various subtleties. We evaluated the texturization algorithm on the PolyBench (4.2.1 beta) benchmark and observed up to 1.6x speedup, with a geometric mean of 1.103x. The title and at ma...
The polyhedral model is known to be a powerful framework to reason about high level loop transformat...
High-level loop transformations change the order in which basic computations in a program are execut...
Polyhedral optimization can parallelize nested affine loops for high-level synthesis (HLS), but poly...
Automatic parallelization is becoming more important as parallelism becomes ub...
Polyhedral compilation has been successful in analyzing, optimizing, automatically parallelizing a...
Selected for presentation at the HiPEAC 2013 Conf. This paper addresses the com...
Although Single Instruction Multiple Data (SIMD) units are available in general purpose processors a...
The polyhedral model for loop parallelization has proved to be an effective tool for advanced optim...
Computers become increasingly complex. Current and future systems feature configurable hardware, mul...
On modern architectures, a missed optimization can translate into performance degradations reaching ...
This thesis proposes new extensions to the code generation phase in polyhedral compilers. The main f...
6 pages. Parallel and heterogeneous computing are growing in audience thanks to ...
The polyhedral model is a powerful framework for automatic optimization and pa...
Tiling is a key technology to increase data reuse in computation kernels. For ...
2013 Spring. Includes bibliographical references. With the introduction of multi-core processors, moti...