In this work, we consider the reformulation of hierarchical ($\mathcal{H}$) matrix algorithms for many-core processors with a model implementation on graphics processing units (GPUs). $\mathcal{H}$ matrices approximate specific dense matrices, e.g., from discretized integral equations or kernel ridge regression, leading to log-linear time complexity in dense matrix-vector products. The parallelization of $\mathcal{H}$ matrix operations on many-core processors is difficult due to the complex nature of the underlying algorithms. While previous algorithmic advances for many-core hardware focused on accelerating existing $\mathcal{H}$ matrix CPU implementations by many-core processors, we here aim at totally relying on that processor ty...
We present memory-efficient and scalable algorithms for kernel methods used in machine learning. Usi...
The high performance computing (HPC) community is obsessed over the general matrix-matrix multiply (...
This file contains the HiCMA library and scripts for reproducing the results presented in the Euro-P...
Hierarchical matrix (H-matrix) techniques can be used to efficiently treat dense matrices. With an H...
Many matrices in scientific computing, statistical inference, and machine learning exhibit sparse an...
Hierarchical matrices (H-matrices) have become important in applications where...
H-matrices offer log-linear storage and computation costs, thanks to a controlled accuracy loss. Th...
Few realize that, for large matrices, many dense matrix computations achieve nearly the sa...
In this paper, we describe and evaluate an extension of the Chameleon library to operate with hierar...
The objective of high performance computing (HPC) is to ensure that the computational power of hardw...
In this paper we review the technique of hierarchical matrices and put it into the context of black-...
Hierarchically semiseparable (HSS) matrix algorithms are emerging techniques in constructing the sup...
Compression techniques have revolutionized the Boundary Element Method used to solve the Maxwell equ...
A number of parallel formulations of dense matrix multiplication algorithm have been developed. For ...