Hardware-Aware Automatic Code-Transformation to Support Compilers in Exploiting the Multi-Level Parallel Potential of Modern CPUs

Feld, Dustin
Jünger, Michael
Mallach, Sven
Soddemann, Thomas

Open link

Publication date

January 2014

DOI

10.1145/2723772.2723776

Publisher

Association for Computing Machinery (ACM)

Abstract

Modern compilers offer more and more capabilities to automatically parallelize code-regions if these match certain properties. However, there are several application kernels that, although rather simple transformations would suffice in order to make them match these properties, are either not at all parallelized by state-of-the-art compilers or could at least be improved w.r.t. their performance. This paper proposes a loop-tiling approach focusing on automatic vectorization and multi-core parallelization, with emphasis on a smart cache exploitation. The method is based on polyhedral code transformations that are applied as a pre-compilation step and it is shown to help compilers in generating more and better parallel code-regions. It automa...