This paper describes how to leverage a task-based implementation of the polar decomposition on massively parallel systems using the PARSEC dynamic runtime system. Based on a formulation of the iterative QR Dynamically-Weighted Halley (QDWH) algorithm, our novel implementation reduces data traffic while exploiting high concurrency from the underlying hardware architecture. First, we replace the most time-consuming classical QR factorization phase with a new hierarchical variant, customized for the specific structure of the matrix during the QDWH iterations. The newly developed hierarchical QR for QDWH not only exploits the matrix structure, but also shortens the length of the critical path to maximize hardware occupancy...
The solution of dense systems of linear equations is at the heart of numerical computations. Such sy...
This paper addresses the efficient exploitation of task-level parallelism, present in many dense lin...
Hierarchically semiseparable (HSS) matrix algorithms are emerging techniques in constructing the sup...
This paper introduces the first asynchronous, task-based formulation of the po...
This paper introduces the first asynchronous, task-based implementation of the polar decomposition o...
We present a high-performance implementation of the Polar Decomposition (PD) on distributed-memory s...
This article introduces a new systolic algorithm for QR factorization, and its implementati...
This paper introduces a new parallel QR decomposition algorithm. The novel load balancing method des...
This paper describes a new QR factorization algorithm which is especially desi...
This article introduces a new systolic algorithm for QR factorization, and its...
This paper describes a new QR factorization algorithm which is especially desi...
As multicore systems continue to gain ground in the high-performance computing...
Challenges introduced by highly hybrid many-core architectures have a lasting impact on the portabi...
To exploit the potential of multicore architectures, recent dense linear algeb...
A novel variant of the parallel QR algorithm for solving dense nonsymmetric eigenvalue problems on h...