The one-dimensional decomposition of nonuniform workload arrays with optimal load balancing is investigated. The problem has been studied in the literature as the "chains-on-chains partitioning" problem. Despite the rich literature on exact algorithms, heuristics are still used in the parallel computing community with the "hope" of good decompositions and the "myth" that exact algorithms are hard to implement and inefficient at runtime. We show that exact algorithms yield significant improvements in load balance over heuristics with negligible overhead. Detailed pseudocode for the proposed algorithms is provided for reproducibility. We start with a literature review and propose improvements and efficient implementation tips for these algorithms...
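To make the contrast between exact and heuristic decompositions concrete, the following is a minimal sketch and not the algorithms proposed in the paper. It assumes non-negative integer weights and uses a well-known exact approach (bisection on the bottleneck value with a greedy feasibility probe) alongside one simple heuristic that cuts at the ideal prefix-sum targets. The function names `probe`, `exact_bottleneck`, and `heuristic_bottleneck` are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right
from itertools import accumulate


def probe(prefix, parts, bottleneck):
    """Greedy feasibility test: can the chain be cut into at most `parts`
    consecutive pieces, each with load <= bottleneck?"""
    n = len(prefix) - 1
    start = 0
    for _ in range(parts):
        # Furthest index reachable from `start` without exceeding the bound.
        end = bisect_right(prefix, prefix[start] + bottleneck) - 1
        if end >= n:
            return True
        start = end
    return False


def exact_bottleneck(weights, parts):
    """Smallest achievable maximum part load (assumes non-negative integers,
    a non-empty `weights`, and parts >= 1)."""
    prefix = [0] + list(accumulate(weights))
    total = prefix[-1]
    # Lower bound: the heaviest task, or the ceiling of the ideal average load.
    lo = max(max(weights), -(-total // parts))
    hi = total  # a single part always fits within the total load
    while lo < hi:
        mid = (lo + hi) // 2
        if probe(prefix, parts, mid):
            hi = mid
        else:
            lo = mid + 1
    return lo


def heuristic_bottleneck(weights, parts):
    """Illustrative heuristic: place the k-th cut at the last position whose
    prefix sum does not exceed k * total / parts."""
    prefix = [0] + list(accumulate(weights))
    total = prefix[-1]
    cuts = [0]
    for k in range(1, parts):
        cuts.append(bisect_right(prefix, k * total // parts) - 1)
    cuts.append(len(weights))
    return max(prefix[b] - prefix[a] for a, b in zip(cuts, cuts[1:]))


if __name__ == "__main__":
    w = [1, 1, 1, 5, 1, 1, 1, 1]
    print(heuristic_bottleneck(w, 2), exact_bottleneck(w, 2))
```

On this small input the heuristic cut yields a maximum load of 9, while the exact search finds 8. Each probe costs at most `parts` binary searches over the prefix sums, which illustrates why the overhead of an exact method can remain small.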
A method is outlined for optimising graph partitions which arise in mapping unstructured mesh calcul...
Sparse matrix partitioning is a common technique used for improving performance of parallel linear i...
The sparse matrix partitioning problem arises when minimizing communication in parallel sparse matri...
The one-dimensional decomposition of nonuniform workload array...
One-dimensional decomposition of nonuniform workload arrays for optimal load balancing is investigat...
We study the problem of one-dimensional partitioning of nonuniform workload arrays, with optimal loa...
Optimal load balancing in sparse matrix decomposition without disturbing the row/column ordering is ...
We study the problem of one-dimensional partitioning of nonuniform workload arrays with optimal load...
To minimize the communication in parallel sparse matrix-vector multiplication while maintaining load...
This extended abstract presents a survey of combinatorial problems encountered in scientific computa...
We investigate one dimensional partitioning of sparse matrices under a given o...
Given a set of 1D intervals and a desired partition number, this paper studies how to make an opt...
Twelve adaptive image-space decomposition algorithms are presented for sort-first parallel direct vo...
Load balancing algorithms improve a program's performance on unbalanced datasets, bu...
In parallel computing, obtaining maximal performance is often mandatory to solve large and complex p...