We address some key issues in designing dense linear algebra (DLA) algorithms that are common to both multi/many-core and special-purpose architectures (in particular GPUs). We present them in the context of an LU factorization algorithm in which randomization techniques are used as an alternative to pivoting. This approach yields an algorithm based entirely on a collection of small Level 3 BLAS-type computational tasks, which has emerged as a common goal in designing DLA algorithms for new architectures. Other common trends, also considered here, are block asynchronous task execution and “Block” layouts for the data associated with the separate tasks. We present numerical results and other specific experiments with DLA algorithms on NVIDIA...
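To make the idea of an LU factorization built only from small Level 3 BLAS-type tasks concrete, here is a minimal pure-Python sketch of a tiled LU without pivoting. It is an illustration under stated assumptions, not the algorithm from the paper: the function names, the tile size `nb`, and the list-of-lists storage are all hypothetical, and the randomization step is not shown. A diagonally dominant test matrix stands in for a matrix that can be factored safely without pivoting.

```python
import random

def lu_nopiv_tile(A, k0, k1):
    """Unpivoted LU of the diagonal tile A[k0:k1, k0:k1], in place."""
    for k in range(k0, k1):
        for i in range(k + 1, k1):
            A[i][k] /= A[k][k]
            for j in range(k + 1, k1):
                A[i][j] -= A[i][k] * A[k][j]

def trsm_lower(A, k0, k1, c0, c1):
    """Row-panel tile: A[k0:k1, c0:c1] <- inv(L_kk) * A[k0:k1, c0:c1]."""
    for k in range(k0, k1):
        for i in range(k + 1, k1):
            for j in range(c0, c1):
                A[i][j] -= A[i][k] * A[k][j]

def trsm_upper(A, k0, k1, r0, r1):
    """Column-panel tile: A[r0:r1, k0:k1] <- A[r0:r1, k0:k1] * inv(U_kk)."""
    for k in range(k0, k1):
        for i in range(r0, r1):
            A[i][k] /= A[k][k]
            for j in range(k + 1, k1):
                A[i][j] -= A[i][k] * A[k][j]

def gemm_update(A, r0, r1, c0, c1, k0, k1):
    """Trailing tile: A[r0:r1, c0:c1] -= A[r0:r1, k0:k1] * A[k0:k1, c0:c1]."""
    for i in range(r0, r1):
        for j in range(c0, c1):
            s = 0.0
            for k in range(k0, k1):
                s += A[i][k] * A[k][j]
            A[i][j] -= s

def tiled_lu(A, nb):
    """Right-looking tiled LU without pivoting; overwrites A with L and U."""
    n = len(A)
    for k0 in range(0, n, nb):
        k1 = min(k0 + nb, n)
        lu_nopiv_tile(A, k0, k1)                 # one task per diagonal tile
        for c0 in range(k1, n, nb):              # independent tasks
            trsm_lower(A, k0, k1, c0, min(c0 + nb, n))
        for r0 in range(k1, n, nb):              # independent tasks
            trsm_upper(A, k0, k1, r0, min(r0 + nb, n))
        for r0 in range(k1, n, nb):              # independent tasks
            for c0 in range(k1, n, nb):
                gemm_update(A, r0, min(r0 + nb, n),
                            c0, min(c0 + nb, n), k0, k1)
    return A

def make_test_matrix(n, seed=0):
    """Random diagonally dominant matrix (assumed safe without pivoting)."""
    rng = random.Random(seed)
    A = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    for i in range(n):
        A[i][i] += n
    return A

def max_residual(A, LU):
    """Largest entry of |A - L*U|, with L unit lower and U upper in LU."""
    n = len(A)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(min(i, j) + 1):
                l_ik = 1.0 if k == i else LU[i][k]
                s += l_ik * LU[k][j]
            worst = max(worst, abs(s - A[i][j]))
    return worst
```

Within one step, the tile solves and the trailing updates are mutually independent, which is what makes this formulation a natural fit for asynchronous task scheduling. In a tuned implementation each tile would typically be stored contiguously and each task would map to an optimized BLAS call.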
Abstract: Few realize that, for large matrices, many dense matrix computations achieve nearly the sa...
Abstract: One-sided dense matrix factorizations are important computational kernels in many scientific...
In this PhD thesis, we study algorithms and implementations to accelerate the solution of dense line...
Abstract. We address some key issues in designing dense linear algebra (DLA) algorithms that are com...
Abstract. We address some key issues in designing dense linear algebra (DLA) algorithms that are co...
If multicore is a disruptive technology, try to imagine hybrid multicore systems enhanced with accel...
Dense linear algebra (DLA) is one of the seven most important kernels in high performance computing. ...
Abstract. If multicore is a disruptive technology, try to imagine hybrid multicore systems enhanced ...
Abstract. Implementations of the Basic Linear Algebra Subprograms (BLAS) interface are major buildin...
Abstract: If multicore is a disruptive technology, try to imagine hybrid multicore systems enhanced ...
Achieving high computation efficiency, in terms of Cycles per Instruction (CPI), for high-performanc...
Nowadays many clusters integrate GPU accelerators in their architectures that...
Abstract: In recent years, parallel processing has been widely used in the computer industry. Software...
We present several algorithms to compute the solution of a linear system of equations on a GPU, as ...
Abstract. We present an efficient and scalable programming model for the development of linear algeb...