The multiplication of a matrix by its transpose, A^T A, appears as an intermediate operation in the solution of a wide set of problems. In this paper, we propose a new cache-oblivious algorithm (AtA) for computing this product, based upon the classical Strassen algorithm as a subroutine. In particular, we decrease the computational cost to the time required by Strassen’s algorithm, amounting to floating-point operations. AtA works for generic rectangular matrices, and exploits the peculiar symmetry of the resulting product matrix for saving memory. In addition, we provide an extensive implementation study of AtA in a shared memory system, and extend its applicability to a distributed environment. To support our findings, we compare o...
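The symmetry mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's AtA algorithm: it splits A by columns only, uses a plain triple-loop product where AtA would call Strassen, and stores a full square array rather than a packed triangle. The names `matmul_t`, `cols`, `ata_lower`, and the `cutoff` parameter are hypothetical and introduced only for this example. It shows that, because A^T A is symmetric, only the lower-triangular blocks need to be computed.

```python
def matmul_t(X, Y):
    """Return X^T Y for matrices stored as lists of rows (plain product)."""
    rows = len(X)
    return [[sum(X[k][i] * Y[k][j] for k in range(rows))
             for j in range(len(Y[0]))]
            for i in range(len(X[0]))]


def cols(A, lo, hi):
    """Return the column slice A[:, lo:hi]."""
    return [row[lo:hi] for row in A]


def ata_lower(A, cutoff=2):
    """Return the lower triangle of A^T A; entries above the diagonal are 0."""
    n = len(A[0])
    if n <= max(cutoff, 1):
        # Base case: form the small Gram matrix and keep its lower triangle.
        G = matmul_t(A, A)
        return [[G[i][j] if j <= i else 0 for j in range(n)] for i in range(n)]
    h = n // 2
    A1, A2 = cols(A, 0, h), cols(A, h, n)
    C11 = ata_lower(A1, cutoff)   # symmetric diagonal block, recursive
    C22 = ata_lower(A2, cutoff)   # symmetric diagonal block, recursive
    C21 = matmul_t(A2, A1)        # general block; a fast product could go here
    # The upper-right block equals C21^T and is never computed.
    top = [C11[i] + [0] * (n - h) for i in range(h)]
    bottom = [C21[i] + C22[i] for i in range(n - h)]
    return top + bottom
```

The input may be rectangular, since the recursion only halves the column count. The two diagonal blocks are themselves products of a matrix by its transpose, so they recurse, while the single off-diagonal block is an ordinary matrix product.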
In this paper a non-recursive Strassen's matrix multiplication algorithm is presented. This new...
This paper describes parallel matrix transpose algorithms on distributed memory concurrent processor...
MapReduce is one of the most popular distributed programming paradigms that al...
Parallel matrix multiplication is one of the most studied fundamental problems in distributed and h...
Strassen’s algorithm to multiply two n×n matrices reduces the asymptotic operation count f...
We present a parallel method for matrix multiplication on distributed-memory MIMD architectures based...
Many fast algorithms in arithmetic complexity have hierarchical or recursive structures that make ef...
In this study, we propose a simple method for fault-tolerant Strassen-like matrix multiplications. T...
Strassen's algorithm for matrix multiplication gains its lower arithmetic complexity at the expe...
A parallel algorithm has perfect strong scaling if its running time on P processors is linear in 1/P...
The paper presents an analysis of matrix multiplication algorithms from the point of view of their effi...
This paper presents a secure multiparty computation protocol for the Strassen-...