In the data-parallel programming style, the user usually specifies the data parallelism explicitly, so that the compiler can generate efficient code without advanced analysis techniques. In some situations, however, specifying the parallelism explicitly is impossible or inconvenient. This is especially true for loop nests with data dependences between data in distributed dimensions. For uniform loop nests, scheduling, mapping and partitioning techniques are available. Several strategies have been considered and evaluated with existing High Performance Fortran compilation systems. This paper gives experimental results on the performance and the benefits of the different techniques and optimiza...
In this report we address the issue of loop tiling to minimize the completion time of the loop when ...
Enabling HPC applications to perform efficiently when invoking multiple parall...
We study the computational power of rational Piecewise Constant Derivative (PCD) systems. PCD system...
It is easy to find errors and inefficient parts of a sequential program, by using a standard debugge...
In the framework of fully permutable loops, tiling has been extensively studied as a source-to-sourc...
In this paper, we survey loop parallelization algorithms, analyzing the dependence representations t...
A parallel programming archetype [Cha94, CMMM95] is an abstraction that captures the common features...
In this paper, an efficient algorithm to simultaneously implement array alignment and data/computati...
In this paper, we compare three nested loops parallelization algorithms (Allen and Kennedy's algorit...
We describe a simple data-parallel kernel language which encapsulates the main data-parallel control...
Automatic parallelization is one of the approaches aimed at a better and easier use of parallel comp...
This thesis intends to show how to efficiently exploit the parallelism present in applications in or...
This report describes three application program interfaces to BPFS, a distributed, modular parallel ...
ABSTRACT: The spectacular evolution of technologies in the field of hardware and software has ...
We describe the compilation and execution of data-parallel languages for networks of workstations. E...