We propose two novel techniques for overcoming load-imbalance encountered when implementing so-called look-ahead mechanisms in relevant dense matrix factorizations for the solution of linear systems. Both techniques target the scenario where two thread teams are created/activated during the factorization, with each team in charge of performing an independent task/branch of execution. The first technique promotes worker sharing (WS) between the two tasks, allowing the threads of the task that completes first to be reallocated for use by the costlier task. The second technique allows a fast task to alert the slower task of completion, enforcing the early termination (ET) of the second task, and a smooth transition of the factorization procedu...
This paper presents CALU, a Communication Avoiding algorithm for the LU factorization of dense matri...
The solution of dense systems of linear equations is at the heart of numerical computations. Such sy...
We analyze the benefits of look-ahead in the parallel execution of the LU factorization with partial...
With the emergence of thread-level parallelism as the primary means for continued improvement of per...
We investigate a parallelization strategy for dense matrix factorization (DMF) algorithms, using Ope...
This dissertation focuses on a widely used linear algebra kernel to solve linear systems, that is th...
We present an out-of-core sparse nonsymmetric LU-factorization algorithm with partial pivoting. We h...
Many linear algebra algorithms require explicit row/column swapping mainly when pivoting operations ...
We investigate several parallel algorithmic variants of the LU factorization with partial pivoting (...
On multicomputers the partial pivoting phase of the LU factorization has a peculiar load unbalancing...
LU factorization with partial pivoting is a canonical numerical procedure and the main comp...
The LU factorization is an important numerical algorithm for solving systems of linear equations in ...