For outer-product-parallel sparse matrix-matrix multiplication (SpGEMM) of the form C=A×B, we propose three hypergraph models that achieve simultaneous partitioning of input and output matrices without any replication of input data. All three hypergraph models perform conformable one-dimensional (1D) columnwise and 1D rowwise partitioning of the input matrices A and B, respectively. The first hypergraph model performs two-dimensional (2D) nonzero-based partitioning of the output matrix, whereas the second and third models perform 1D rowwise and 1D columnwise partitioning of the output matrix, respectively. This partitioning scheme induces a two-phase parallel SpGEMM algorithm, where communication-free local SpGEMM computations constitute th...
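As a minimal sketch of the outer-product formulation described above, the following Python code assigns column k of A and row k of B to the same part (a conformable 1D columnwise/rowwise partition), so each part can form its outer products without input replication. The function name `outer_product_spgemm`, the dictionary-based sparse storage, and the `part` vector are assumptions made for illustration and are not taken from the paper. The final summation of partial results follows from the identity C = Σ_k A(:,k) B(k,:); how the paper assigns that summation to processors depends on the chosen output partitioning and is not modeled here.

```python
from collections import defaultdict

def outer_product_spgemm(A_cols, B_rows, part):
    """Hypothetical sketch of outer-product-parallel SpGEMM.

    A_cols[k] maps row index i -> a_ik (column k of A).
    B_rows[k] maps column index j -> b_kj (row k of B).
    part[k] is the part owning both column k of A and row k of B.
    """
    num_parts = max(part) + 1
    partial = [defaultdict(float) for _ in range(num_parts)]

    # Local step: each part multiplies only the columns/rows it owns,
    # so no input data needs to be exchanged or replicated.
    for k, p in enumerate(part):
        for i, a in A_cols[k].items():
            for j, b in B_rows[k].items():
                partial[p][(i, j)] += a * b

    # Combine step: partial results for the same C nonzero are summed.
    C = defaultdict(float)
    for local in partial:
        for ij, v in local.items():
            C[ij] += v
    return dict(C), partial

# A = [[1, 2], [0, 3]], B = [[4, 0], [5, 6]]
A_cols = [{0: 1.0}, {0: 2.0, 1: 3.0}]
B_rows = [{0: 4.0}, {0: 5.0, 1: 6.0}]
C, partial = outer_product_spgemm(A_cols, B_rows, part=[0, 1])
```

In this toy run, both parts produce a partial result for the nonzero C(0,0), which is the kind of overlap that has to be resolved when partial results are combined.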
We consider two-dimensional partitioning of general sparse matrices for parallel sparse matrix-vecto...
We propose a comprehensive and generic framework to minimize multiple and different volume-based com...
Exploiting spatial and temporal localities is investigated for efficient row-by-row parallelization ...
Cataloged from PDF version of article. For outer-product-parallel sparse matrix-matrix multiplicatio...
We investigate outer-product-parallel, inner-product-parallel, and row-by-row-product-parallel fo...
Cataloged from PDF version of thesis. Includes bibliographical references (leaves 102-107). Thesis (Ph...
Abstract. Generalized sparse matrix-matrix multiplication (or SpGEMM) is a key primitive for many hi...
Abstract. Sparse matrix-matrix multiplication (or SpGEMM) is a key primitive for many high-performan...
There are three common parallel sparse matrix-vector multiply algorithms: 1D 3...
We provide an exposition of hypergraph models for parallelizing sparse matrix-vector multiplies. Our...
In this work, we show that the standard graph-partitioning based decomposition of sparse matrices do...
We investigate one dimensional partitioning of sparse matrices under a given o...
We propose a new two-phase method for the coarse-grain decomposition of irregular computational doma...
Sparse matrix-matrix multiplication (or SpGEMM) is a key primitive for many hi...