We propose a reproducible variant of the unblocked LU factorization for graphics processing units (GPUs). For this purpose, we build upon Level-1/2 BLAS kernels that deliver correctly rounded and reproducible results for the dot (inner) product, vector scaling, and the matrix-vector product. In addition, we outline a strategy to improve the accuracy of the triangular solve via iterative refinement. Following a bottom-up approach, we then construct a reproducible unblocked implementation of the LU factorization for GPUs, which accommodates partial pivoting for stability and can eventually be integrated into a high-performance, stable algorithm for the (blocked) LU factorization.
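The bottom-up construction described above can be illustrated with a minimal CPU sketch in Python. It is not the GPU implementation from the paper. The names `exact_dot_update` and `lu_unblocked` are hypothetical, the Crout-style loop ordering is one possible choice among several, and exact rational accumulation with `fractions.Fraction` merely stands in for the correctly rounded, reproducible dot and matrix-vector kernels. Each update is rounded once at the end, so the result does not depend on the order of summation.

```python
from fractions import Fraction


def exact_dot_update(c, xs, ys):
    """Return c - sum(x*y), accumulated exactly and rounded once to a float.

    Assumption: exact rational arithmetic stands in for a correctly
    rounded, reproducible dot-product kernel.
    """
    acc = Fraction(c)
    for x, y in zip(xs, ys):
        acc -= Fraction(x) * Fraction(y)
    return float(acc)  # single, correctly rounded conversion


def lu_unblocked(matrix):
    """Unblocked (Crout-style) LU factorization with partial pivoting.

    Returns (lu, piv): lu holds the unit lower factor L below the diagonal
    and U on and above it; row k of the permuted matrix is row piv[k] of
    the input.
    """
    n = len(matrix)
    a = [[float(v) for v in row] for row in matrix]
    piv = list(range(n))
    for k in range(n):
        # Update column k from row k downwards (a matrix-vector product).
        for i in range(k, n):
            col = [a[r][k] for r in range(k)]
            a[i][k] = exact_dot_update(a[i][k], a[i][:k], col)
        # Partial pivoting: pick the entry of largest magnitude in column k.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            raise ZeroDivisionError("matrix is singular")
        if p != k:
            a[k], a[p] = a[p], a[k]
            piv[k], piv[p] = piv[p], piv[k]
        # Update the rest of row k of U (dot products).
        for j in range(k + 1, n):
            col = [a[r][j] for r in range(k)]
            a[k][j] = exact_dot_update(a[k][j], a[k][:k], col)
        # Scale the subdiagonal of column k; IEEE division rounds correctly.
        pivot = a[k][k]
        for i in range(k + 1, n):
            a[i][k] = a[i][k] / pivot
    return a, piv
```

Calling `lu_unblocked([[2, 1, 1], [4, 3, 3], [8, 7, 9]])` returns the packed factors together with the row permutation. Because every accumulation is exact until its final rounding, reordering the operands of `exact_dot_update` leaves the result bitwise unchanged, which is the property a reproducible kernel is meant to provide.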
We study several solvers for the solution of general linear systems where the main objective...
The sparse matrix solver is a critical component in circuit simulators. Some researches have develop...
By a block representation of LU factorization for a general matrix introduced by Amodio and ...
In this article, we address the problem of reproducibility of the blocked LU f...
The process of finding the solution of a linear system of equations is often t...
LU factorization is the most computationally intensive step in solving systems of linear equ...
The LU factorization is an important numerical algorithm for solving systems of linear equations in ...
Multicore architectures enhanced with multiple GPUs are likely to become mainstream High Pe...
Modern GPUs equipped with mixed precision tensor core units present great pote...
Lower-upper (LU) factorization is widely used in many scientific computations. It is one of the most...
The LU factorization is an important numerical algorithm for solving systems of linear equations in ...
In recent years, parallel processing has been widely used in the computer industry. Software...
LU factorization with partial pivoting is a canonical numerical procedure and the main comp...
We present new algorithms to detect and correct errors in the lower-upper fact...