While much work has been devoted to the study of cache behavior during the execution of codes with regular access patterns, little attention has been paid to irregular codes. An important class of these codes is scientific applications that handle compressed sparse matrices. This work presents a probabilistic model for predicting the number of misses on a K-way associative cache memory, considering sparse matrices with a uniform or banded distribution. Two irregular kernels are considered: the sparse matrix-vector product and the transposition of a sparse matrix. The model was validated with simulations on synthetic uniform matrices and banded matrices from the Harwell-Boeing collection. Keywords: Sparse matrix, ir...
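To make the "irregular" access pattern of the sparse matrix-vector product concrete, the following is a minimal sketch of that kernel. It assumes compressed-row (CSR) storage, which the abstract does not specify; the names `csr_spmv`, `values`, `col_idx`, and `row_ptr` are illustrative and not taken from the paper. The indirect read `x[col_idx[k]]` is the access whose cache behavior depends on the distribution of nonzeros.

```python
def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A held in CSR form (assumed layout)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # values and col_idx are traversed sequentially (regular accesses).
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # x is read through col_idx: the touched positions follow the
            # nonzero pattern of row i, which is the irregular access.
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Small 3x3 example matrix:
# [[4, 0, 1],
#  [0, 2, 0],
#  [3, 0, 5]]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
y = csr_spmv(values, col_idx, row_ptr, [1.0, 2.0, 3.0])
```

For a banded matrix the entries of `col_idx` within a row stay close to the diagonal, so reads of `x` are clustered; for a uniform distribution they are scattered across the whole vector.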
In this article, we introduce a cache-oblivious method for sparse matrix–vector multiplication. Our ...
Blocking is a well-known optimization technique for improving the effectiveness of memory hierarchie...
Matrix transposition is a fundamental operation, but it may present a very low and hardly predictabl...
A probabilistic model to estimate the number of misses on a set associative cache with an LRU replac...
Sparse matrices are in the kernel of numerical applications. Their compressed storage, which permits...
Many scientific applications handle compressed sparse matrices. Cache behavior during the executio...
Algorithms for sparse matrix-vector multiplication (SpMxV for short) are important building blocks...
While there are many studies on the locality of dense codes, few deal with the locality of sparse co...
In previous work it was found that cache blocking of sparse matrix vector multiplication yielded sig...
We consider the problem of building high-performance implementations of sparse matrix-vector multipl...
Abstract. We present new performance models and more compact data structures for cache blocking when...
Cache behavior is complex and inherently unstable, yet it is a critical factor affecting program per...
We present new performance models and a new, more compact data structure for cache blocking when ap...
Abstract. In this paper we construct an analytic model of cache misses during matrix multiplication. T...
Abstract. Sparse scientific codes face grave performance challenges as memory bandwidth limitations gr...