In this work, we consider the reformulation of hierarchical (H) matrix algorithms for many-core processors with a model implementation on graphics processing units (GPUs). H matrices approximate specific dense matrices, e.g., from discretized integral equations or kernel ridge regression, leading to log-linear time complexity in dense matrix-vector products. The parallelization of H matrix operations on many-core processors is difficult due to the complex nature of the underlying algorithms. While previous algorithmic advances for many-core hardware focused on accelerating existing H matrix CPU implementations with many-core processors, we here aim to rely entirely on that processor type. As our main contribution, we introduce the necessary par...
In this paper, we describe and evaluate an extension of the Chameleon library to operate with hierar...
Matrix-matrix multiplication is one of the core computations in many algorithms from scientific comp...
In a previous PPoPP paper we showed how the FLAME methodology, combined with the SuperMatrix runtim...
In this work, we consider the reformulation of hierarchical ($\mathcal{H}$) matrix algorithm...
Hierarchical matrix (H-matrix) techniques can be used to efficiently treat dense matrices. With an H...
Many matrices in scientific computing, statistical inference, and machine learning exhibit sparse an...
Hierarchical matrices (H-matrices) have become important in applications where...
Few realize that, for large matrices, many dense matrix computations achieve nearly the sa...
We present a new approach to utilizing all CPU cores and all GPUs on heterogeneous multicore and mul...
A number of parallel formulations of dense matrix multiplication algorithms have been developed. For ...
We consider the realization of matrix-matrix multiplication and propose a hierarchical alg...
In this document, we describe two strategies of distribution of computations that can be used to imp...
Hierarchically semiseparable (HSS) matrix algorithms are emerging techniques in constructing the sup...
H-matrices offer log-linear storage and computation costs, thanks to a controlled accuracy loss. Th...
Matrix Factorization (MF) has been widely applied in machine learning and data mining. Due to the la...