This paper discusses optimizing computational linear algebra algorithms on a ring cluster of IBM RS/6000s. We present results for a block Cholesky factorization and the underlying BLAS to demonstrate the advantage of using blocking algorithms on such architectures. A thorough analysis of the complexities of the problem is provided. Different communication protocols, serial versus parallel execution, and optimization of data traffic are explored. We provide insight into some of the techniques we have found useful in exploiting this particular design. The implementations demonstrate that this important architecture can be utilized effectively for sufficiently large dense matrix computations.
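The abstract does not spell out the factorization itself, so the following is only a minimal sketch of a generic right-looking block Cholesky, written in plain Python rather than with BLAS calls. The function names, the block size parameter `nb`, and the list-of-rows matrix layout are assumptions made for illustration; they are not taken from the paper, and no distribution over a ring of processors is modelled.

```python
import math

def cholesky_unblocked(a):
    """Return lower-triangular L with L * L^T == a (a is a list of rows)."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        l[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = s / l[j][j]
    return l

def block_cholesky(a, nb):
    """Right-looking blocked Cholesky; nb is a hypothetical block size."""
    n = len(a)
    a = [list(map(float, row)) for row in a]  # work on a copy
    for k in range(0, n, nb):
        e = min(k + nb, n)
        # Step 1: factor the diagonal block (the role POTRF plays in LAPACK).
        diag = cholesky_unblocked([row[k:e] for row in a[k:e]])
        for i in range(k, e):
            for j in range(k, e):
                a[i][j] = diag[i - k][j - k]
        # Step 2: panel solve X * L_kk^T = A[e:, k:e] (a TRSM-like step).
        for i in range(e, n):
            for j in range(k, e):
                s = a[i][j] - sum(a[i][p] * a[j][p] for p in range(k, j))
                a[i][j] = s / a[j][j]
        # Step 3: update the lower part of the trailing matrix (SYRK-like).
        for i in range(e, n):
            for j in range(e, i + 1):
                a[i][j] -= sum(a[i][p] * a[j][p] for p in range(k, e))
    # Clear the strict upper triangle so only L remains.
    for i in range(n):
        for j in range(i + 1, n):
            a[i][j] = 0.0
    return a

if __name__ == "__main__":
    spd = [[4, 2, 2, 2], [2, 5, 3, 3], [2, 3, 6, 4], [2, 3, 4, 7]]
    for row in block_cholesky(spd, 2):
        print(row)
```

In a tuned implementation, steps 2 and 3 would be single Level 3 BLAS calls on whole blocks, which is where blocking is expected to pay off: most of the arithmetic lands in matrix-matrix operations that reuse data already in fast memory.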
In this paper we consider the data distribution and data movement issues related to the solution of ...
As multicore systems continue to gain ground in the high performance computing...
During the last half-decade, a number of research efforts have centered around developing software f...
This paper discusses the scalability of Cholesky, LU, and QR factorization routines on MIMD distribu...
Block-cyclic order elimination algorithms for LU and QR factorization and solve routines are describ...
The solution of dense systems of linear equations is at the heart of numerical computations. Such sy...
This report builds on the work done in the deliverable [Nava94]. There it was shown tha...
In this paper, we present a new load balancing technique, called panel scattering, which is generall...
Our experimental results showed that block based algorithms for numerically intensive applications a...
The goal of the LAPACK project is to provide efficient and portable software for dense numerical lin...
In this paper, we analyse and compare the techniques of algorithmic blocking and (storage blocking w...
Dense linear algebra computations are essential to nearly every problem in scientific computing and ...
This paper discusses the design of linear algebra libraries for high performance computers. Particul...
Linear systems, and the solving of them, are an important tool in many areas of science. The solving o...
With the emergence of thread-level parallelism as the primary means for continued improvement of per...