In this thesis, we present developments to the approach used by the LRI Parsys team to automatically translate MATLAB-like scientific codes into high performance production codes. To reach a high level of performance, we have combined C++ template meta-programming and asynchronous parallel programming to analyze each expression and detect parallelism opportunities first, and then to ensure near-optimal use of the available resources of multi-core machines. To link these two stages of the code generation process, we have implemented a solution based on multi-level algorithmic skeletons. We have implemented our tools in the NT2 library and evaluated them with several significant scientific benchmarks.
Over the past decade, computing systems have evolved toward parallel and heterogeneous architectures. Composed of s...
Application-specific instruction set processors (ASIP) are a well-known compromise between the high ...
Automatic parallelization is one of the approaches aimed at a better and easier use of parallel comp...
Distributed memory machines consisting of multiple autonomous processors connected by a network are ...
As single processing unit performance has reached a technological limit, the power wall, the past de...
Thesis defense committee: DR, DHOME Michel, President; PR, MIGUET Serge, Reviewer; MCF-HDR, HOUZET Domini...
Clusters of multicore/GPU nodes connected with a fast network offer very high theoretical peak perfor...
This thesis intends to show how to efficiently exploit the parallelism present in applications in or...
Enabling HPC applications to perform efficiently when invoking multiple parall...
Supercomputing plays an important role in several innovative fields, speeding up prototyping or vali...
Scientific applications have an increasing need for resources and many grand scientific challenges re...
The development and maintenance of high-performance scientific computing software is a complex task....
Learning stochastic models generating sequences has many applications in natural language processing...
Heterogeneous architectures have been widely used in the domain of high performance computing. Howev...
Miniaturization of electronic components has led to the introduction of complex electronic systems w...