Cataloged from PDF version of article. The one-dimensional decomposition of nonuniform workload arrays with optimal load balancing is investigated. The problem has been studied in the literature as the "chains-on-chains partitioning" problem. Despite the rich literature on exact algorithms, heuristics are still used in the parallel computing community with the "hope" of good decompositions and the "myth" of exact algorithms being hard to implement and not runtime efficient. We show that exact algorithms yield significant improvements in load balance over heuristics with negligible overhead. Detailed pseudocode for the proposed algorithms is provided for reproducibility. We start with a literature review and propose improvements and e...
In this work, we show that the standard graph-partitioning-bas...
Sparse matrix partitioning is a common technique for improving the performance of parallel linear i...
Load imbalance in an application can lead to degradation of performance and a significant drop in sy...
The one-dimensional decomposition of nonuniform workload arrays with optimal load balancing is inves...
One-dimensional decomposition of nonuniform workload arrays for optimal load balancing is investigat...
We study the problem of one-dimensional partitioning of nonuni...
We study the problem of one-dimensional partitioning of nonuniform workload arrays with optimal load...
Optimal load balancing in sparse matrix decomposition without disturbing the row/column ordering is ...
We investigate one-dimensional partitioning of sparse matrices under a given o...
This extended abstract presents a survey of combinatorial problems encountered in scientific computa...
To minimize the communication in parallel sparse matrix-vector multiplication while maintaining load...
Given a set of 1D intervals and a desired partition number, this paper studies how to make an opt...
Load balancing algorithms improve a program's performance on unbalanced datasets, bu...
The scalability of sparse matrix-vector multiplication (SpMV) on distributed memory systems depends ...
In parallel computing, obtaining maximal performance is often mandatory to solve large and complex p...