Implementing linear algebra kernels on distributed-memory parallel computers raises the problem of how to distribute matrices and vectors among the processors. A block-cyclic distribution suits most algorithms well, but the block size must be chosen as a compromise between efficiency and load balancing. The best choice depends heavily on the operation, so it is essential to be able to move from one distribution to another very quickly. We present here the algorithms we implemented in the ScaLAPACK library. A complexity study then shows the efficiency of our solution. Timing results on a network of SUN workstations and on the Cray T3D using PVM corroborate this analysis.
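To make the notion of moving between block-cyclic distributions concrete, the following is a minimal one-dimensional sketch. It is not the ScaLAPACK algorithm described in the abstract: it simply enumerates every element, which is far slower than a real redistribution routine. The function names `owner` and `redistribution_plan` and all parameter names are hypothetical, chosen only for this illustration.

```python
from collections import defaultdict


def owner(i, block, nprocs):
    """Map global index i to (process, local index) in a 1-D block-cyclic layout.

    Blocks of `block` consecutive elements are dealt to the `nprocs`
    processes in round-robin order.
    """
    blk = i // block                      # index of the block containing i
    proc = blk % nprocs                   # blocks are dealt cyclically
    local = (blk // nprocs) * block + i % block
    return proc, local


def redistribution_plan(n, src_block, src_procs, dst_block, dst_procs):
    """Group the n elements by (sender, receiver) pair.

    Naive O(n) enumeration, used here only to show which data must move
    when the block size (or process count) changes.
    """
    plan = defaultdict(list)
    for i in range(n):
        src, src_local = owner(i, src_block, src_procs)
        dst, dst_local = owner(i, dst_block, dst_procs)
        plan[(src, dst)].append((src_local, dst_local))
    return dict(plan)


if __name__ == "__main__":
    # 8 elements on 2 processes, block size 2 -> block size 1.
    for pair, moves in sorted(redistribution_plan(8, 2, 2, 1, 2).items()):
        print(pair, moves)
```

Each key of the returned dictionary is a (sender, receiver) pair and each value lists the (source local index, destination local index) pairs that would travel in that message. Even in this tiny case every process exchanges data with every other, which is why the cost of redistribution matters.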
This article is devoted to the run-time redistribution of one-dimensional arra...
Array redistribution is usually required to enhance algorithm performance in many parall...
We present a new fast and scalable matrix multiplication algorithm, called DIMMA (Distribution-Indep...
Implementing linear algebra kernels on distributed memory parallel computers raises the proble...
Implementing linear algebra kernels on distributed memory parallel computers raises the problem of d...
Implementing linear algebra kernels on distributed memory parallel computers raises the problem of d...
This research aims at creating and providing a framework to describe algorithmic redistribution meth...
This paper discusses some algorithmic issues when computing with a heterogeneous network of workstat...
This paper discusses some algorithmic issues when computing with a heterogeneous network of workstat...
This paper describes the design of ScaLAPACK, a scalable software library for performing dense and b...
This paper discusses some algorithmic issues when computing with a heterogeneo...
Run-time array redistribution is necessary to enhance the performance of parallel programs on distri...
This paper discusses some algorithmic issues when computing with a heterogeneo...
This article is devoted to the run-time redistribution of one-dimensional arra...
This article is devoted to the run-time redistribution of one-dimensional arra...