Most high-performance scientific libraries have adopted hybrid parallelization schemes - such as the popular MPI+OpenMP hybridization - to benefit from the capacities of modern distributed-memory machines. While these approaches have been shown to achieve high performance, they require considerable effort to design and maintain sophisticated synchronization/communication strategies. Task-based programming paradigms, on the other hand, aim to delegate this burden to a runtime system in order to maximize productivity. In this article, we assess the potential of task-based fast multipole methods (FMM) on clusters of multicore processors. We propose both a hybrid MPI+task FMM parallelization and a pure task-based parallelization where the MPI c...
Parallel programs need to manage the time trade-off between synchronization and computation. A high ...
Optimization and search problems are often NP-complete, and brute-force techniques...
In this paper, we describe and evaluate an extension of the Chameleon library to operate with hierar...
The emergence of accelerators as standard computing resources on supercomputers and the subsequent a...
With the advent of complex modern architectures, the low-level paradigms long considered sufficient t...
High performance FMM is crucial for the numerical simulation of many physical problems. In a previo...
ABSTRACT: The spectacular evolution of hardware and software technologies has enabled...
A now-classical way of meeting the increasing demand for computing speed by HPC applications is the ...
Task-based models and runtimes are quite popular in the HPC community. They help to implement applica...
Clusters of multicore/GPU nodes connected with a fast network offer very high theoretical peak perfor...
Miniaturization of electronic components has led to the introduction of complex electronic systems w...
Fast Multipole Methods (FMM) are a fundamental operation for the simulation of many physical problem...
This thesis intends to show how to efficiently exploit the parallelism present in applications in or...
Over the past several years, classical multiprocessor systems have evolved into multicores, which tightly inte...
As single processing unit performance has reached a technological limit, the power wall, the past de...