Abstract. We present new performance models and more compact data structures for cache blocking when applied to sparse matrix-vector multiply (SpM×V). We extend our prior models by relaxing the assumption that the vectors fit in cache and find that the new models are accurate enough to predict optimum block sizes. In addition, we determine criteria that predict when cache blocking improves performance. We conclude with architectural suggestions that would make memory systems execute SpM×V faster.
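To make the optimization concrete, the following is a minimal sketch of cache-blocked SpM×V. It is an illustrative assumption, not the paper's own data structure: the matrix is split into column blocks of a fixed width so that the corresponding slice of the source vector x stays cache-resident while a block is processed, and each block is stored as a plain CSR submatrix. The names `csr_block`, `blocked_csr`, and `spmv_cache_blocked` are hypothetical.

```c
/* Sketch of cache-blocked SpMxV over column blocks of a CSR-like matrix.
 * Assumption: each block covers all rows but only a contiguous range of
 * columns, chosen so the matching slice of x fits in cache. */

typedef struct {
    int nrows;        /* rows in this (full-height) column block      */
    int *rowptr;      /* CSR row pointers, length nrows + 1           */
    int *colind;      /* column indices, local to this block          */
    double *val;      /* nonzero values                               */
    int col_start;    /* first global column covered by this block    */
} csr_block;

typedef struct {
    int nrows, ncols;
    int nblocks;      /* number of column blocks                      */
    csr_block *blocks;
} blocked_csr;

/* y += A * x, processed one column block at a time so that the slice
 * x[col_start .. col_start + block_width) is reused from cache. */
void spmv_cache_blocked(const blocked_csr *A, const double *x, double *y)
{
    for (int b = 0; b < A->nblocks; b++) {
        const csr_block *blk = &A->blocks[b];
        const double *xb = x + blk->col_start;   /* cached slice of x */
        for (int i = 0; i < blk->nrows; i++) {
            double sum = 0.0;
            for (int k = blk->rowptr[i]; k < blk->rowptr[i + 1]; k++)
                sum += blk->val[k] * xb[blk->colind[k]];
            y[i] += sum;
        }
    }
}
```

The block width is the tuning parameter that a performance model of the kind described above would select; the one-CSR-per-block layout shown here is the simplest variant, and the "more compact data structures" of the abstract refer to reducing the index overhead that such blocking introduces.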