Abstract Providing high-level tools for parallel programming while sustaining a high level of performance is a challenge that techniques like Domain Specific Embedded Languages (DSELs) try to solve. In previous work, we investigated the design of such a DSEL – NT2 – providing a Matlab-like syntax for parallel numerical computations inside a C++ library. In this paper, we show how NT2 has been redesigned for shared memory systems in an extensible and portable way. The new NT2 design relies on a tiered Parallel Skeleton system built using asynchronous task management and automatic compile-time taskification of user-level code. We describe how this system can operate various shared memory runtimes and evaluate the design using several benchmarks.
This paper describes a very high-level approach that aims to orchestrate sequential components writt...
The article describes various options for speeding up calculations on computer systems. These featur...
Parallel programming is hard and programmers still struggle to write code for shared memory multicor...
GPGPUs and other accelerators are becoming a mainstream asset for high-performance computing. Raisin...
We present a new C++ library design for linear algebra computations on high performance architecture...
The increasing complexity of new parallel architectures has widened the gap between adaptability and...
In this thesis, we present developments to the approach used by the LRI Parsys team to automatically...
With modern advancements in hardware and software technology scaling towards new limits, our compute...
This work deals with parallelism in linear algebra routines. We propose a doma...
The design and implementation of high level tools for parallel programming is ...
In this paper we describe the main components of the NanosCompiler, an OpenMP compiler whose impleme...
ABSTRACT Parallel numerical software based on the message-passing model is enormously complicated. ...