Data movements between different levels of the memory hierarchy (I/O-transitions, or simply I/Os) are a critical performance bottleneck in modern computing. It is therefore of high practical relevance to find algorithms that use a minimal number of I/Os. We present a cache-oblivious sparse matrix-sparse matrix multiplication algorithm whose worst-case number of I/Os matches a previously established lower bound for this problem (O(N²/(B·M)) read-I/Os and O(N²/B) write-I/Os, where N is the size of the problem instance, M is the size of the fast memory, and B is the size of the cache lines). When the output does not need to be stored, the number of write-I/Os can also be reduced to O(N²/(B·M)). This improves the worst-cas...
The thesis introduces a cache-oblivious method for the sparse matrix-vector (SpMV) multiplication, w...
In this work, we study the cache-oblivious computation model, which is inspired by the behaviour of ...
This paper initiates the study of I/O algorithms (minimizing cache misses) from the perspective of f...
In this article, we introduce a cache-oblivious method for sparse matrix–vector multiplication. Our ...
The sparse matrix–vector (SpMV) multiplication is an important kernel in many applications. When the...
In this thesis we introduce a cost measure to compare the cache-friendliness of different permutati...
© Erik D. Demaine, Andrea Lincoln, Quanquan C. Liu, Jayson Lynch, and Virginia Vassilevska Williams....
One of the keys to tap the full performance potential of current hardware is the optimal uti...
Let X[0..n-1] and Y[0..m-1] be two sorted arrays, and define the m × n matrix A by A[j][i]...
This report deals with the efficient calculation of matrix-matrix multiplication, without using explici...
In this paper we explore a simple and general approach for developing parallel algorithms that lead ...
This paper presents asymptotically optimal algorithms for rectangular matrix transpose, FF...
We analyze the problem of sparse-matrix dense-vector multiplication (SpMV) in the I/O-model. The ta...