When Cache Blocking of Sparse Matrix Vector Multiply Works and Why

Rajesh Nishtala
Richard Vuduc
James W. Demmel
Katherine A. Yelick

Publication date

January 2004

Abstract

We present new performance models and a new, more compact data structure for cache blocking when applied to the sparse matrix-vector multiply (SpMV) operation, y ← y + Ax. Prior work indicates that cache blocked SpMV performs very well for some matrix and machine combinations, yielding speedups as high as 3x. We look at the general question of when and why performance improves, finding that cache blocking is most effective when simultaneously 1) x does not fit in cache, 2) y fits in cache, 3) the non-zeros are distributed throughout the matrix, and 4) the non-zero density is sufficiently high. We extend our prior performance models, which bounded performance by assuming x and y fit in cache, to consider these classes of matrices. Unl...
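To make the operation concrete, the following is a minimal sketch of SpMV with and without cache blocking. It assumes each cache block is stored as its own small compressed sparse row (CSR) matrix, which is a common baseline layout and not the more compact data structure the abstract refers to. The function names (`csr_from_dense`, `spmv_csr`, `cache_block`, `spmv_cache_blocked`) and the block dimensions `r` and `c` are hypothetical, chosen only for illustration.

```python
def csr_from_dense(rows):
    """Build CSR arrays (values, col_idx, row_ptr) from a dense list of rows."""
    values, col_idx, row_ptr = [], [], [0]
    for row in rows:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def spmv_csr(csr, x, y, row0=0, col0=0):
    """Compute y <- y + A x for a CSR (sub)matrix with top-left corner (row0, col0)."""
    values, col_idx, row_ptr = csr
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # Column indices are local to the block, so shift by col0.
            acc += values[k] * x[col0 + col_idx[k]]
        y[row0 + i] += acc


def cache_block(rows, r, c):
    """Split a matrix into r x c cache blocks, each stored as its own CSR (assumed layout)."""
    n_rows, n_cols = len(rows), len(rows[0])
    blocks = []
    for i0 in range(0, n_rows, r):
        for j0 in range(0, n_cols, c):
            sub = [row[j0:j0 + c] for row in rows[i0:i0 + r]]
            csr = csr_from_dense(sub)
            if csr[0]:  # keep only blocks that contain non-zeros
                blocks.append((i0, j0, csr))
    return blocks


def spmv_cache_blocked(blocks, x, y):
    """Compute y <- y + A x one cache block at a time.

    Each block touches at most c consecutive entries of x and r consecutive
    entries of y, so those slices can stay in cache while the block is processed.
    """
    for i0, j0, csr in blocks:
        spmv_csr(csr, x, y, row0=i0, col0=j0)
```

Both routines accumulate into `y` in place, so calling `spmv_csr(csr_from_dense(A), x, y)` and `spmv_cache_blocked(cache_block(A, r, c), x, y)` on separate zero-initialized `y` vectors should give the same result. The blocked version changes only the order in which non-zeros are visited, which is why its benefit depends on the conditions listed in the abstract: how large x and y are relative to the cache, and how the non-zeros are distributed.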
