We propose efficient parallel algorithms and implementations of LU factorization over a finite field on shared memory architectures. Compared with the corresponding numerical routines, we identify three main difficulties specific to linear algebra over finite fields. First, the arithmetic cost can be dominated by modular reductions, so these reductions must be delayed as much as possible while mixing fine-grain parallelizations of tiled iterative and recursive algorithms. Second, fast linear algebra variants, e.g., those using the Strassen-Winograd algorithm, never suffer from instability and can thus be widely used in cascade with the classical algorithms. There, trade-offs are to be made between s...
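The idea of delaying modular reductions can be illustrated on a dot product over Z/pZ. The following is a minimal sketch, not the implementation described in the abstract: the function name `dot_mod_delayed`, the `acc_bits` parameter, and the simulated fixed-width accumulator are assumptions made for illustration. Products are accumulated without reduction, and a single `% p` is applied per block, where the block length is chosen so that the running sum can never exceed the accumulator capacity.

```python
def dot_mod_delayed(a, b, p, acc_bits=63):
    """Dot product of a and b over Z/pZ with delayed modular reduction.

    Assumes every entry of a and b is already reduced, i.e. lies in [0, p-1].
    acc_bits simulates the width of a machine accumulator (hypothetical
    parameter; Python integers themselves never overflow).
    Returns (result, number_of_reductions_performed).
    """
    if p < 2:
        raise ValueError("modulus must be at least 2")
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")

    limit = (1 << acc_bits) - 1   # largest value the accumulator may hold
    max_term = (p - 1) ** 2       # largest possible single product
    # Number of products that can be added on top of a reduced value (< p)
    # without the accumulator exceeding `limit`.
    block = (limit - (p - 1)) // max_term
    if block < 1:
        raise ValueError("modulus too large for this accumulator width")

    acc = 0
    reductions = 0
    for start in range(0, len(a), block):
        # Accumulate one block of products with no reduction at all.
        for x, y in zip(a[start:start + block], b[start:start + block]):
            acc += x * y
        acc %= p                  # one reduction per block
        reductions += 1
    return acc, reductions
```

With a 63-bit accumulator and a modulus near 2^16, a block spans roughly two billion products, so a vector of practical length needs a single reduction instead of one per multiplication. The same bound-based reasoning applies to the inner products of a matrix multiplication, which is where a block factorization spends most of its time.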
The impact of the communication on the performance of numerical algorithms increases with the number...
We propose a new algorithm for multiplying dense polynomials with integer coef...
This paper describes the design, implementation and performance of parallel direct dense symmetric...
We present block algorithms and their implementation for the parallelization o...
We present here algorithms for efficient computation of linear algebra problem...
This dissertation focuses on a widely used linear algebra kernel to solve linear systems, that is th...
Our experimental results showed that block-based algorithms for numerically intensive applications a...
With the emergence of thread-level parallelism as the primary means for continued improvement of per...
This dissertation treats...
The LU factorization is an important numerical algorithm for solving systems of linear equations in ...
This paper gives improved parallel methods for several exact factorizations of some classes ...