We present accurate piece-wise time and energy models of high-performance multi-threaded implementations of the general matrix multiplication, the triangular system solve with multiple right-hand sides, and the symmetric rank-k update. These models are then assembled into accurate models of the Cholesky factorization built on top of these Level-3 BLAS operations. Our models account for the time and energy costs of the floating-point operations involved in the routines, as well as the overhead of data movement across the levels of the memory hierarchy. The accuracy of the multi-threaded models is tested on an Intel Xeon E5-2620 processor, reporting relative errors for the Cholesky factorization that are, respectively, aro...
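To illustrate how a Cholesky factorization can be built on top of Level-3 BLAS-style operations, the following is a minimal sketch of a right-looking blocked algorithm in NumPy. It is not the implementation modeled in the abstract above. The function name `blocked_cholesky` and the `block` parameter are hypothetical, a general dense solver stands in for a true triangular solve, and the trailing update is written as a single matrix product (tiled implementations typically split it into symmetric rank-k updates on diagonal tiles and general matrix multiplications on off-diagonal tiles).

```python
import numpy as np


def blocked_cholesky(A, block=64):
    """Return lower-triangular L with A = L @ L.T (A symmetric positive definite).

    Illustrative right-looking blocked variant; names are hypothetical.
    """
    A = np.array(A, dtype=float)  # work on a copy, leave the input untouched
    n = A.shape[0]
    for k in range(0, n, block):
        e = min(k + block, n)
        # Factor the diagonal block (the unblocked Cholesky step).
        A[k:e, k:e] = np.linalg.cholesky(A[k:e, k:e])
        if e < n:
            L11 = A[k:e, k:e]
            # Triangular-solve step: A21 <- A21 * L11^{-T}.
            # Assumption: np.linalg.solve (a general solver) stands in for TRSM.
            A[e:, k:e] = np.linalg.solve(L11, A[e:, k:e].T).T
            # Trailing update: A22 <- A22 - A21 * A21^T (rank-k update).
            A[e:, e:] -= A[e:, k:e] @ A[e:, k:e].T
    # Discard the stale upper triangle left over from the input matrix.
    return np.tril(A)
```

A quick check is to build a symmetric positive definite matrix, for example `S = M @ M.T + n * np.eye(n)` for a random `M`, and confirm that `L @ L.T` reproduces `S`. In a cost model of the kind the abstract describes, each of the three steps in the loop body would be charged its own time and energy estimate per iteration.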
In linear algebra, Cholesky factorization is useful in solving a system of equations wit...
In this paper we present a new parallel algorithm for computing the Cholesky decomposition (LL^T) of...
We describe a parallel algorithm for finding the Cholesky factorization of a sparse symmetric posit...
This is the author’s version of a work that was accepted for publication in Simulation Modelling Pra...
In this paper we introduce a model for the total energy consumption of the Cholesky factorization on...
The bottleneck of most data analyzing systems, signal processing systems, and intensive computing sy...
We present accurate piece-wise models for the time and energy costs of high performance implementati...
Cholesky factorization is a fundamental problem in most engineering and science computation applicat...
In this paper, we propose a model for the energy consumption of the concurrent execution of three ke...
A Choleski method is described and used to solve linear systems of equations that arise in large sca...
Currently, state-of-the-art libraries, like MAGMA, focus on very large linear algebra probl...
This paper proposes a hardware accelerator for Cholesky decomposition on FPGAs by designi...
We discuss some performance issues of the tiled Cholesky factorization on non-uniform memory access-...
Solving a large number of relatively small linear systems has recently drawn more attention ...
The Sony/Toshiba/IBM (STI) CELL processor introduces pioneering solutions in p...