In this paper we investigate the execution of Ab and A^T b, where A is a sparse matrix and b is a dense vector, using the Blocked Based Compression Storage (BBCS) scheme and an Augmented Vector Architecture (AVA). In particular, we demonstrate that the BBCS format can represent both the direct and the transposed matrix for the purposes of matrix-vector multiplication at no additional cost in storage, access time, or computational performance. To achieve this, we propose a new instruction and a hardware modification for the AVA. Subsequently, we evaluate the performance of the transposed Sparse Matrix Vector Multiplication (SMVM) and demonstrate that, as for the direct SMVM, the BBCS scheme outperforms other general schemes like t...
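The idea that one stored copy of A can serve both Ab and A^T b can be illustrated with a minimal sketch. It assumes the common Compressed Sparse Row (CSR) layout rather than BBCS, whose details are not given here, and the function and variable names (`csr_matvec`, `csr_matvec_transposed`, `row_ptr`, `col_idx`, `vals`) are hypothetical. The transposed product scatters into the output vector instead of gathering from the input, so no transposed copy of the matrix is built.

```python
# Illustrative sketch only: CSR is used as a stand-in format, not BBCS.

def csr_matvec(n_rows, row_ptr, col_idx, vals, b):
    """Compute y = A b by gathering from b along each stored row."""
    y = [0.0] * n_rows
    for i in range(n_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += vals[k] * b[col_idx[k]]
    return y


def csr_matvec_transposed(n_rows, n_cols, row_ptr, col_idx, vals, b):
    """Compute z = A^T b from the same arrays by scattering into z."""
    z = [0.0] * n_cols
    for i in range(n_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            z[col_idx[k]] += vals[k] * b[i]
    return z


# Example matrix (2 x 3), chosen for illustration:
#     A = [[1, 0, 2],
#          [0, 3, 0]]
ROW_PTR = [0, 2, 3]
COL_IDX = [0, 2, 1]
VALS = [1.0, 2.0, 3.0]

y = csr_matvec(2, ROW_PTR, COL_IDX, VALS, [1.0, 1.0, 1.0])
z = csr_matvec_transposed(2, 3, ROW_PTR, COL_IDX, VALS, [1.0, 2.0])
```

In this software sketch the transposed product trades contiguous writes for indexed (scattered) writes, which is the kind of access-pattern cost that a dedicated storage format and hardware support would aim to avoid.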
The handling of the sparse matrix vector product (SMVP) is a common kernel in many scientific applica...
An important kernel of scientific software is the multiplication of a sparse matrix by a vector. The...
The problem of obtaining high computational throughput from sparse matrix multiple-vector multiplic...
In this dissertation we have identified vector processing shortcomings related to the efficient stor...
Many scientific applications involve operations on sparse matrices. However, due to irregul...
Many applications based on finite element and finite difference methods include the soluti...
The irregular nature of sparse matrix-vector multiplication, Ax = y, has led to the development of a...
Sparse matrix-vector multiplication (SpM×V) has been characterized as one of the most signi...
In earlier work we have introduced the “Recursive Sparse Blocks” (RSB) sparse matrix storage scheme...