We address the parallelization of the LU factorization of hierarchical matrices (H-matrices) arising from boundary element methods. Our approach exploits task-parallelism via the OmpSs programming model and runtime, which discovers the data-flow parallelism intrinsic to the operation at execution time, via the analysis of data dependencies based on the memory addresses of the tasks’ operands. This is especially challenging for H-matrices, as the structures containing the data vary in dimension during the execution. We tackle this issue by decoupling the data structure from that used to detect dependencies. Furthermore, we leverage the support for weak operands and early release of dependencies, recently introduced in OmpSs-2, to accelerate t...
Processors with large numbers of cores are becoming commonplace. In order to take advantage of the a...
With the emergence of thread-level parallelism as the primary means for continued improvement of per...
H-matrices offer log-linear storage and computation costs, thanks to a controlled accuracy loss. Th...
Hierarchical matrices (H-matrices) have become important in applications where...
A version of the H-LU factorization is introduced, based on the individual computational tasks occu...
In this paper, we investigate how to exploit task-parallelism during the execution of the Cholesky ...
We investigate a parallelization strategy for dense matrix factorization (DMF) algorithms, using Ope...
Compression techniques have revolutionized the Boundary Element Method used to solve the Maxwell equ...
In this paper we present several improvements of widely used parallel LU factorization methods on sp...
Due to the evolution of massively parallel computers towards deeper levels of parallelism and memory...
In this study, we evaluate two task frameworks with dependencies for important application kernels c...
As multicore systems continue to gain ground in the high performance computing...