This article introduces a new systolic algorithm for QR factorization and its implementation on a supercomputing cluster of multicore nodes. The algorithm targets a virtual 3D array and requires only local communications. The implementation uses threads at the node level and MPI for inter-node communication. The complexity of the implementation is managed with the PaRSEC software, which takes as input a parametrized dependence graph derived from the algorithm and only requires the user to decide, at a high level, the allocation of tasks to nodes. We show that the new algorithm exhibits competitive performance with state-of-the-art QR routines on a supercomputer called Kraken, which sh...
In this paper we study the performance of two classical dense linear algebra a...
The increasing complexity of modern computer architectures has greatly influenced algorithm design. ...
To exploit the potential of multicore architectures, recent dense linear algeb...
This paper describes a new QR factorization algorithm which is especially desi...
This paper describes how to leverage a task-based implementation of the polar ...
One of the major trends in the design of exascale architectures is the use of ...
A systolic array provides an alternative computing paradigm to the von Neumann architecture...
A tuned and scalable fast multipole method as a preeminent algorithm for exascale systems Rio Yokota...
Multicomputers (distributed memory MIMD machines) have emerged as inexpensive, yet powerful parallel...
As multicore systems continue to gain ground in the high-performance computing...
This study focuses on the performance of two classical dense linear algebra algorithms, the LU and t...
A novel variant of the parallel QR algorithm for solving dense nonsymmetric eigenvalue problems on h...