On multicore architectures, the ratio of peak memory bandwidth to peak floating-point performance (the byte:flop ratio) decreases as core counts increase, further limiting the performance of bandwidth-limited applications. Multiplying a sparse matrix (and, in the unsymmetric case, its transpose) by a dense vector is the core of sparse iterative methods. In this paper, we present a new multithreaded algorithm for the symmetric case that can potentially cut bandwidth requirements in half while exposing ample parallelism in practice. We also give a new data structure transformation, called bit masked register blocks, which promises significant reductions in bandwidth requirements by reducing the number of indexing elements without i...
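To make the bandwidth argument for the symmetric case concrete, the following is a minimal serial sketch, not the multithreaded algorithm of the paper. It assumes the matrix is stored as the upper triangle only (diagonal included) in CSR form; the function name `symmetric_spmv_upper` and the array names `row_ptr`, `col_idx`, and `values` are illustrative choices, not taken from the source. Each stored off-diagonal entry is read once and used twice, which is where the potential halving of matrix traffic comes from.

```python
def symmetric_spmv_upper(n, row_ptr, col_idx, values, x):
    """Compute y = A @ x for a symmetric A, given only its upper triangle in CSR.

    Assumption (not from the source): row i holds the entries A[i][j] with j >= i.
    """
    y = [0.0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            j = col_idx[k]
            a = values[k]
            y[i] += a * x[j]      # contribution of the stored entry A[i][j]
            if j != i:
                y[j] += a * x[i]  # mirrored contribution of A[j][i], never stored
    return y


# Example matrix (symmetric):
#   [[4, 1, 0],
#    [1, 3, 2],
#    [0, 2, 5]]
# Only the upper triangle is stored: 5 values instead of 7.
row_ptr = [0, 2, 4, 5]
col_idx = [0, 1, 1, 2, 2]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
x = [1.0, 2.0, 3.0]

y = symmetric_spmv_upper(3, row_ptr, col_idx, values, x)
print(y)  # [6.0, 13.0, 19.0]
```

The mirrored update writes to `y[j]` for a row other than the one being traversed, so a naive row-parallel version of this loop would race on `y`. Handling those conflicting writes while keeping enough parallelism is the part a multithreaded algorithm has to solve, and this sketch does not attempt it.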
This dissertation presents an architecture to accelerate sparse matrix linear algebra, which is among...
We improve the performance of sparse matrix-vector multiply (SpMV) on modern cache-based superscalar...
Many scientific applications involve operations on sparse matrices. However, due to irregul...
The sparse matrix-vector multiplication is an important kernel, but is hard to efficiently execute ...
Sparse computations are ubiquitous in computational codes, with the sparse matrix-vector (SpMV) mult...
In earlier work we have introduced the “Recursive Sparse Blocks” (RSB) sparse matrix storage scheme...
The problem of obtaining high computational throughput from sparse matrix multiple-vector multiplic...
We design and develop a work-efficient multithreaded algorithm for sparse matrix-sparse vector multi...
The sparse matrix-vector multiplication (SpMV) is a fundamental kernel used in computational...
A simple and efficient algorithm for the bandwidth reduction of sparse symmetric matrices is propose...
The problem of sparse matrix bandwidth reduction is addressed and solved with two approaches suitabl...
We are witnessing a dramatic change in computer architecture due to the multicore paradigm shift, as...
Sparse matrix-vector multiplication (SpMV) is the dominant kernel in scientific simulations....