Abstract. We consider the realization of matrix-matrix multiplication and propose a hierarchical algorithm implemented in a task-parallel way using multiprocessor tasks on distributed memory. The algorithm is designed to minimize communication overhead while exhibiting high locality of memory references. The task-parallel realization makes the algorithm especially well suited to clusters of SMPs, since tasks can be mapped to the different cluster nodes to exploit the cluster architecture efficiently. Experiments on current cluster machines show that the resulting execution times are competitive with state-of-the-art methods such as PDGEMM.
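The abstract does not spell out the algorithm, so the following is only a generic sketch of the hierarchical idea, not the authors' method. It assumes square matrices stored as lists of rows and uses a plain recursive quadrant decomposition. The function names (`matmul_naive`, `matmul_hier`, `split`, `add`, `join`) and the `cutoff` parameter are hypothetical. The eight sub-products at each level are independent of one another, which is the kind of structure a task-parallel realization could map to separate processor groups; here they simply run sequentially.

```python
def matmul_naive(a, b):
    """Reference triple-loop product of two dense matrices (lists of rows)."""
    n, k, m = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]


def split(mat):
    """Split a square matrix of even order into its four quadrants."""
    h = len(mat) // 2
    top, bottom = mat[:h], mat[h:]
    return ([row[:h] for row in top], [row[h:] for row in top],
            [row[:h] for row in bottom], [row[h:] for row in bottom])


def add(x, y):
    """Element-wise sum of two matrices of equal shape."""
    return [[u + v for u, v in zip(rx, ry)] for rx, ry in zip(x, y)]


def join(c11, c12, c21, c22):
    """Reassemble four quadrants into one matrix."""
    upper = [left + right for left, right in zip(c11, c12)]
    lower = [left + right for left, right in zip(c21, c22)]
    return upper + lower


def matmul_hier(a, b, cutoff=2):
    """Recursive block product of two square matrices of the same order.

    Blocks of order <= cutoff, or of odd order, fall back to the naive
    product. The eight sub-products per level are mutually independent
    and could be scheduled as separate tasks (assumption: sequential here).
    """
    n = len(a)
    if n <= cutoff or n % 2 != 0:
        return matmul_naive(a, b)
    a11, a12, a21, a22 = split(a)
    b11, b12, b21, b22 = split(b)

    def mul(x, y):
        return matmul_hier(x, y, cutoff)

    c11 = add(mul(a11, b11), mul(a12, b21))
    c12 = add(mul(a11, b12), mul(a12, b22))
    c21 = add(mul(a21, b11), mul(a22, b21))
    c22 = add(mul(a21, b12), mul(a22, b22))
    return join(c11, c12, c21, c22)
```

Operating on quadrants keeps each sub-product working on a smaller, contiguous part of the operands, which is one simple way to obtain locality of memory references. The sketch says nothing about data distribution or communication, which the paper treats separately.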
This paper describes a novel parallel algorithm that implements a dense matrix multiplication operat...
Parallel sparse matrix-matrix multiplication algorithms (PSpGEMM) spend most of their running time o...
In this paper, we address the issue of implementing matrix-matrix multiplication on heterogeneous pl...
Matrix-matrix multiplication is one of the core computations in many algorithms from scientific comp...
Matrix multiplication is one of the important operations in scientific and engineering applications. ...
The multiplication of a vector by a matrix is the kernel operation in many algorithms used in scient...
A novel parallel algorithm for matrix multiplication is presented. It is based on a 1-D hyper-systol...
Task-based programming models have succeeded in gaining the interest of the hi...
Hierarchical matrix (H-matrix) techniques can be used to efficiently treat dense matrices. With an H...
The multiplication of large sparse matrices is a basic operation for many scientific and engineering ...
Proceedings of the 8th IEEE International Conference on Cluster Computing (Cluster 2006), October, 2...
The matrix multiplication algorithm of Dekel, Nassimi and Sahni, or Hypercube, is analysed, m...
We present a new fast and scalable matrix multiplication algorithm, called DIMMA (Distribution-Indep...
A number of parallel formulations of the dense matrix multiplication algorithm have been developed. For ...