Abstract—Many sparse matrix computations can be sped up if the matrix is first reordered. Reordering was originally developed for direct methods, but it has recently become popular for improving the cache locality of parallel iterative solvers: reordering the matrix to reduce bandwidth and wavefront can improve the locality of reference of sparse matrix-vector multiplication (SpMV), the key kernel in iterative solvers. In this paper, we present the first parallel implementations of two widely used reordering algorithms: Reverse Cuthill-McKee (RCM) and Sloan. On 16 cores of the Stampede supercomputer, our parallel RCM is 5.56 times faster on average than a state-of-the-art sequential implementation of RCM in the HSL library. Sloan...
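To make the bandwidth-reduction idea concrete, here is a minimal sequential sketch of Reverse Cuthill-McKee in plain Python. It is not the parallel algorithm the abstract describes, nor the HSL implementation. The function names `rcm` and `bandwidth`, the adjacency-dictionary input, and the minimum-degree choice of start vertex are all assumptions made for illustration.

```python
from collections import deque

def bandwidth(adj, order):
    """Largest |pos[u] - pos[v]| over all edges (u, v) under the given ordering."""
    pos = {v: k for k, v in enumerate(order)}
    return max((abs(pos[u] - pos[v]) for u in adj for v in adj[u]), default=0)

def rcm(adj):
    """Reverse Cuthill-McKee ordering of an undirected graph {vertex: set(neighbors)}.

    Assumption: each component starts from a minimum-degree vertex, a simple
    stand-in for the pseudo-peripheral vertex search used by production codes.
    """
    visited, order = set(), []
    for start in sorted(adj, key=lambda v: (len(adj[v]), v)):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            order.append(u)
            # Enqueue unvisited neighbors in order of increasing degree.
            for w in sorted(adj[u] - visited, key=lambda v: (len(adj[v]), v)):
                visited.add(w)
                queue.append(w)
    return order[::-1]  # reversing the Cuthill-McKee order gives RCM

# A path graph 0-3-1-4-2 whose labels are deliberately scrambled.
adj = {0: {3}, 1: {3, 4}, 2: {4}, 3: {0, 1}, 4: {1, 2}}
before = bandwidth(adj, list(range(5)))  # natural ordering: 3
after = bandwidth(adj, rcm(adj))         # RCM ordering: 1
```

On this toy graph the reordering brings every nonzero next to the diagonal, which is the property that tends to improve the locality of SpMV on larger matrices.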
On multicore architectures, the ratio of peak memory bandwidth to peak floating-point performance (b...
This paper presents a compiler and runtime framework for parallelizing sparse matrix computations th...
This whitepaper addresses applicability of the MapReduce paradigm for scientific computing by realiz...
The thesis introduces a cache-oblivious method for the sparse matrix-vector (SpMV) multiplication, w...
Sparse matrix-vector multiplication (SpMV for short) is one of the most common subroutines in the numerica...
The ordering of large sparse symmetric matrices for small profile and wavefront or for small ban...
The paper "Bringing Order to Sparsity: A Sparse Matrix Reordering Study on Multicore CPUs" compares ...
It is well-known that reordering techniques applied to sparse matrices are common strategies to impr...
Abstract. Computer simulations of realistic applications usually require solving a set of non-linear...
Gary Kumfert and Alex Pothen have improved the quality and run time of two ordering algorithms for m...
Sparse matrix-matrix multiplication (SpMM) is a key operation in numerous areas from information ...
We present implementation details of a reordering strategy for permuting elements whose absolute val...
A simple and efficient algorithm for the bandwidth reduction of sparse symmetric matrices is propose...