The objective of this paper is to extend and redesign the block matrix reduction applied for the family of two-sided factorizations, introduced by Dongarra et al. [9], to the context of multicore architectures using algorithms-by-tiles. In particular, the Block Hessenberg Reduction is very often used as a pre-processing step in solving dense linear algebra problems, such as the standard eigenvalue problem. Although expensive, orthogonal transformations are commonly used for this reduction because they guarantee stability, as opposed to Gaussian Elimination. Two versions of the Block Hessenberg Reduction are presented in this paper, the first one with Householder reflectors and the second one with Givens rotations. A short investigation...
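Both flavors of orthogonal transformation named above can be made concrete. The following NumPy sketch (an illustration of the classical unblocked reduction, not the paper's tiled algorithm; the function names are hypothetical) reduces a matrix to upper Hessenberg form with Householder reflectors, and also shows the elementary Givens rotation that the second version applies entry by entry.

```python
import numpy as np

def householder_hessenberg(A):
    """Reduce A to upper Hessenberg form H = Q^T A Q with Householder
    reflectors (classical unblocked algorithm, shown for illustration;
    the paper's tiled formulation restructures this for multicore)."""
    H = np.array(A, dtype=float)
    n = H.shape[0]
    for k in range(n - 2):
        x = H[k+1:, k]
        v = x.copy()
        v[0] += np.copysign(np.linalg.norm(x), x[0])  # avoid cancellation
        nv = np.linalg.norm(v)
        if nv == 0.0:
            continue  # column already has the required zeros
        v /= nv
        # Two-sided similarity update with the reflector P = I - 2 v v^T:
        H[k+1:, k:] -= 2.0 * np.outer(v, v @ H[k+1:, k:])
        H[:, k+1:] -= 2.0 * np.outer(H[:, k+1:] @ v, v)
    return H

def givens(a, b):
    """Rotation (c, s) with [[c, s], [-s, c]] @ [a, b] = [r, 0]; the
    Givens-based version applies many of these instead of reflectors."""
    r = np.hypot(a, b)
    return (1.0, 0.0) if r == 0.0 else (a / r, b / r)
```

Since H is similar to A, the two share a spectrum, so comparing np.linalg.eigvals(H) against np.linalg.eigvals(A) gives a quick sanity check on the sketch.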
To exploit the potential of multicore architectures, recent dense linear algeb...
With the emergence of thread-level parallelism as the primary means for continued improvement of per...
The objective of this paper is to extend, in the context of multicore architectures, the concepts of...
H-matrices offer log-linear storage and computation costs, thanks to a controlled accuracy loss. Th...
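The "controlled accuracy loss" mentioned here comes from replacing admissible off-diagonal blocks by low-rank factorizations. Below is a minimal NumPy sketch of that compression step, assuming truncated SVD as the compressor (actual H-matrix libraries typically use cheaper schemes such as ACA):

```python
import numpy as np

def compress_block(B, tol=1e-8):
    """Truncated-SVD compression of a matrix block: return factors (U, V)
    with B ~ U @ V and ||B - U @ V||_2 <= tol * ||B||_2. Storage drops
    from m*n entries to rank*(m+n), the source of log-linear H-matrix
    costs when applied across a hierarchical block partition."""
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    rank = int(np.sum(s > tol * s[0])) if s[0] > 0 else 0
    rank = max(rank, 1)  # keep at least rank one for simplicity
    return U[:, :rank] * s[:rank], Vt[:rank, :]
```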
This thesis considers two problems in numerical linear algebra and high performance computing (HPC):...
In many scientific applications, eigenvalues of a matrix have to be computed. By first reducing a ma...
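The reduce-first workflow this abstract refers to can be tried directly with SciPy's standard routine (a usage sketch of library functions, not of the implementation the abstract describes):

```python
import numpy as np
from scipy.linalg import hessenberg

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))

# Reduce A to upper Hessenberg form: A = Q @ H @ Q.T with Q orthogonal.
H, Q = hessenberg(A, calc_q=True)
assert np.allclose(Q @ H @ Q.T, A)

# The similarity transform preserves eigenvalues, so the subsequent
# eigenvalue iteration can run on the cheaper-to-handle H instead of A.
assert np.allclose(np.sort_complex(np.linalg.eigvals(H)),
                   np.sort_complex(np.linalg.eigvals(A)))
```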
A block tridiagonal matrix is factored with minimal fill-in using...
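For reference, the classical sequential block LU of a block tridiagonal matrix (the block Thomas algorithm) looks as follows. This is a minimal sketch assuming no pivoting is needed; it does not reproduce the paper's fill-in-minimizing parallel scheme.

```python
import numpy as np

def block_tridiag_solve(D, L, U, b):
    """Solve a block tridiagonal system by block LU without pivoting
    (classical block Thomas algorithm; sequential sketch assuming the
    diagonal blocks stay well conditioned).
    D: list of n diagonal blocks; L, U: lists of n-1 sub-/super-diagonal
    blocks; b: list of n right-hand-side block vectors."""
    n = len(D)
    D = [d.astype(float).copy() for d in D]
    b = [x.astype(float).copy() for x in b]
    # Forward elimination: fold each sub-diagonal block into the next row.
    for i in range(1, n):
        M = L[i - 1] @ np.linalg.inv(D[i - 1])
        D[i] -= M @ U[i - 1]
        b[i] -= M @ b[i - 1]
    # Back substitution on the resulting block upper bidiagonal system.
    x = [None] * n
    x[n - 1] = np.linalg.solve(D[n - 1], b[n - 1])
    for i in range(n - 2, -1, -1):
        x[i] = np.linalg.solve(D[i], b[i] - U[i] @ x[i + 1])
    return x
```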
Dense linear algebra represents fundamental building blocks in many computational science and engine...
There exist algorithms, also called "fast" algorithms, which exploit the special structure of Toepli...
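As one concrete instance of such fast algorithms (a usage sketch; the abstract's own methods are not shown), SciPy exposes a Levinson-recursion solver that works from the defining column and row of a Toeplitz matrix in O(n^2) time, versus O(n^3) for a structure-oblivious dense solve:

```python
import numpy as np
from scipy.linalg import solve_toeplitz, toeplitz

# A Toeplitz matrix is fully determined by its first column c and first
# row r, so a "fast" solver never needs to form the dense matrix.
c = np.array([4.0, 1.0, 0.5, 0.25])   # first column
r = np.array([4.0, 2.0, 1.0, 0.5])    # first row
b = np.ones(4)

x_fast = solve_toeplitz((c, r), b)          # Levinson recursion, O(n^2)
x_ref = np.linalg.solve(toeplitz(c, r), b)  # dense LU, O(n^3)
assert np.allclose(x_fast, x_ref)
```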
Our experimental results showed that block-based algorithms for numerically intensive applications a...
In this paper, we present an algorithm for the reduction to block upper-Hessenberg form which can be...
We propose efficient parallel algorithms and implementations on shared memory architectures of LU fa...
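The kind of kernel such papers parallelize can be sketched sequentially. Below is a right-looking blocked LU in NumPy/SciPy, a minimal sketch assuming pivoting is unnecessary (e.g. for diagonally dominant input); the shared-memory scheduling that is the abstract's actual contribution is not shown.

```python
import numpy as np
from scipy.linalg import solve_triangular

def blocked_lu(A, nb=32):
    """Right-looking blocked LU without pivoting. Returns the packed
    L\\U factors (unit diagonal of L implied) in a copy of A."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    for k in range(0, n, nb):
        e = min(k + nb, n)
        # 1. Unblocked LU of the current tall panel A[k:, k:e].
        for j in range(k, e):
            A[j+1:, j] /= A[j, j]
            A[j+1:, j+1:e] -= np.outer(A[j+1:, j], A[j, j+1:e])
        if e < n:
            # 2. Form the U panel with a unit lower-triangular solve.
            A[k:e, e:] = solve_triangular(A[k:e, k:e], A[k:e, e:],
                                          lower=True, unit_diagonal=True)
            # 3. Trailing (Schur complement) update.
            A[e:, e:] -= A[e:, k:e] @ A[k:e, e:]
    return A
```

The trailing update in step 3 is one large matrix multiply, which is why blocked and tiled formulations expose most of their parallelism there.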