In this paper, the index space of the (n×n)-matrix multiply-add problem C = C + A·B is represented as a 3D n×n×n torus. All possible time-scheduling functions to activate the computation and data rolling inside the 3D torus index space are determined. To maximize efficiency when solving a single problem, we mapped the computations onto a 2D n×n toroidal array processor. All optimal 2D data allocations that solve the problem in n multiply-add-roll steps are obtained. The well-known Cannon's algorithm corresponds to one of the resulting allocations. We used the optimal data allocations to describe all variants of the GEMM operation on the 2D toroidal array processor. By controlling the data movement, the transposition operation is avoided in 75% of the...
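As an illustration of the n multiply-add-roll steps mentioned above, the following is a minimal sequential sketch of Cannon's algorithm, the one allocation the abstract names. It assumes one matrix element per processing element and simulates the torus with plain Python lists; the function name `cannon_multiply_add` and the particular skewing convention (row i of A rolled left by i, column j of B rolled up by j) are assumptions made for this example, not the paper's notation, and the other optimal allocations it derives are not shown.

```python
def cannon_multiply_add(A, B, C):
    """Simulate C = C + A*B on an n x n torus, one element per cell.

    Hypothetical helper for illustration only: it follows the standard
    Cannon skewing, which may differ from the paper's own formulation.
    """
    n = len(A)
    # Initial skewed allocation: cell (i, j) holds A[i][(i+j) % n] and
    # B[(i+j) % n][j], so every cell starts with a matching (k, k) pair.
    a = [[A[i][(j + i) % n] for j in range(n)] for i in range(n)]
    b = [[B[(i + j) % n][j] for j in range(n)] for i in range(n)]
    c = [row[:] for row in C]  # work on a copy of C

    for _ in range(n):
        # Multiply-add: every cell updates its local element of C.
        for i in range(n):
            for j in range(n):
                c[i][j] += a[i][j] * b[i][j]
        # Roll: A moves one cell left along rows, B one cell up along
        # columns, with wrap-around provided by the torus links.
        a = [[a[i][(j + 1) % n] for j in range(n)] for i in range(n)]
        b = [[b[(i + 1) % n][j] for j in range(n)] for i in range(n)]
    return c


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    C = [[1, 0], [0, 1]]
    print(cannon_multiply_add(A, B, C))  # [[20, 22], [43, 51]]
```

After t rolls, cell (i, j) holds A[i][(i+j+t) mod n] and B[(i+j+t) mod n][j], so over n steps each cell accumulates all n products of its inner sum exactly once.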
This paper presents results of our study on double-precision general matrix-matrix multiplic...
We present and compare several methods for multiplying banded square matrices. Various storage sch...
Matrix multiplication is important in many scientific fields, such as mathematics, physics and...
A distributed algorithm with the same functionality as the single-processor level 3 BLAS operation...
In this paper, I explain a previously published three-dimensional algorithm for multiplying two two-...
Some level-2 and level-3 Distributed Basic Linear Algebra Subroutines (DBLAS) that have been impleme...
This article presents new properties of the mesh array for matrix multiplication. In contrast to the...
This paper develops optimal algorithms to multiply an n × n symmetric tridiagonal matrix by:...
Matrix multiplication is commonly used in scientific computation. Given matrices A = (aij ) of size ...
Using super-resolution techniques to estimate the direction that a signal arrived at a radio receive...
In this thesis it is shown how an \(O(n^{4-\epsilon})\) algorithm for the cube multiplication probl...
Dense linear systems of equations are quite common in science and engineering, arising in boundary e...
The generic matrix multiply (GEMM) function is the core element of high-performance linear algebra l...
A data-flow approach is used to solve dense symmetric systems of equations on a torus-connec...
A parallel matrix multiplication algorithm is presented, and studies of its performance and estimati...