It is well established that mixed precision algorithms that factorize a matrix at a precision lower than the working precision can reduce the execution time of parallel solvers for dense linear systems. Much less is known about the efficiency of mixed precision parallel algorithms for sparse linear systems, and existing work focuses on single-core experiments. We evaluate the benefits of using single precision arithmetic in solving double precision sparse linear systems using multiple cores, focusing on the key components of LU factorization and matrix–vector products. We find that single precision sparse LU factorization is prone to a severe loss of performance due to the intrusion of subnormal numbers. We identify a mechanism...
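The subnormal intrusion mentioned above can be illustrated with a minimal NumPy sketch (not from the paper): a product of small single precision values can underflow below the smallest normalized fp32 number while remaining nonzero, landing in the subnormal range that many processors handle much more slowly.

```python
import numpy as np

# Smallest normalized single precision number (~1.18e-38).
tiny = np.finfo(np.float32).tiny

# A product of small fp32 factors, as can arise in LU updates,
# underflows into the subnormal range: nonzero, but below `tiny`.
x = np.float32(1e-30) * np.float32(1e-10)  # 1e-40 in fp32

assert x > 0      # not flushed to zero
assert x < tiny   # strictly below the normalized fp32 range
```

In double precision the same product (1e-40) is a perfectly normal number, which is one reason the slowdown appears only when the factorization is carried out in single precision.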
We present an out-of-core sparse nonsymmetric LU-factorization algorithm with partial pivoting. We h...
We present a simulation-based performance model to analyze a parallel sparse LU factorization algori...
Motivated by the demand in machine learning, modern computer hardware is increasingly supporting r...
It is well established that reduced precision arithmetic can be exploited to accelerate the solution...
On many current and emerging computing architectures, single-precision calculations are at least twi...
The standard LU factorization-based solution process for linear systems can be enhanced in speed or ...
By using a combination of 32-bit and 64-bit floating point arithmetic the performance of many sparse...
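The 32-bit/64-bit combination referred to above is typically realized as iterative refinement: factorize once in single precision, then correct the solution with double precision residuals. A minimal sketch, assuming SciPy's `lu_factor`/`lu_solve` and a well-conditioned system (the function name and tolerances are illustrative, not from any of the cited papers):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def mixed_precision_solve(A, b, tol=1e-12, max_iter=10):
    """Solve Ax = b: factorize in fp32, refine the solution in fp64."""
    # One cheap single precision LU factorization, reused in every step.
    lu, piv = lu_factor(A.astype(np.float32))
    x = lu_solve((lu, piv), b.astype(np.float32)).astype(np.float64)
    for _ in range(max_iter):
        r = b - A @ x  # residual computed in double precision
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        # Correction solved with the fp32 factors, accumulated in fp64.
        d = lu_solve((lu, piv), r.astype(np.float32)).astype(np.float64)
        x += d
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 100)) + 100 * np.eye(100)  # well conditioned
b = rng.standard_normal(100)
x = mixed_precision_solve(A, b)
```

The O(n^3) factorization runs at single precision speed, while the O(n^2) refinement steps restore double precision accuracy, which is where the performance gain comes from.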
We present a performance model to analyze a parallel sparse LU factorization algorithm on modern cach...
Today's floating-point arithmetic landscape is broader than ever. While scientific computing has tra...
The growing availability of low precision arithmetics (tfloat32, fp16, bfloat16, fp8) in...
We present an overview of parallel direct methods for solving sparse systems of linear equations, fo...
It is important to have a fast, robust and scalable algorithm to solve a sparse linear system AX=B. ...
This is the pre-peer reviewed version of the following article: Adaptive precision in block‐Jacobi p...
We review the influence of the advent of high-performance computing on the solution of linea...
We investigate performance characteristics for the LU factorization of large matrices with various ...