This paper presents asymptotically equal lower and upper bounds for the number of parallel I/O operations required to perform bit-matrix-multiply/complement (BMMC) permutations on the Parallel Disk Model proposed by Vitter and Shriver. A BMMC permutation maps a source index to a target index by an affine transformation over GF(2), where the source and target indices are treated as bit vectors. The class of BMMC permutations includes many common permutations, such as matrix transposition (when dimensions are powers of 2), bit-reversal permutations, vector-reversal permutations, hypercube permutations, matrix reblocking, Gray-code permutations, and inverse Gray-code permutations. The upper bound improves upon the asymptotic bound in the previ...
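The definition in the abstract above (target index = affine transformation of the source index over GF(2)) can be illustrated with a minimal sketch. It is not the paper's I/O algorithm; it only evaluates the index mapping in memory. The names `bmmc_target`, `is_permutation`, and the row-bitmask encoding of the characteristic matrix are assumptions made for this illustration.

```python
def bmmc_target(x, rows, c):
    """Return y = A*x XOR c over GF(2) for an n-bit source index x.

    Hypothetical encoding: rows[i] is an integer bitmask for row i of A,
    where bit j of rows[i] holds A[i][j]. Bit i of the product is the
    parity of (rows[i] AND x); c is the complement vector as an integer.
    """
    y = 0
    for i, row in enumerate(rows):
        parity = bin(row & x).count("1") & 1  # GF(2) dot product
        y |= parity << i
    return y ^ c


def is_permutation(rows, c, n):
    """Brute-force check that the mapping is a bijection on n-bit indices,
    which holds exactly when A is nonsingular over GF(2)."""
    return len({bmmc_target(x, rows, c) for x in range(1 << n)}) == 1 << n


n = 3

# Bit reversal: target bit i is source bit n-1-i (anti-diagonal matrix).
bit_reversal_rows = [1 << (n - 1 - i) for i in range(n)]

# Gray code: target bit i is x_i XOR x_(i+1), i.e. y = x ^ (x >> 1).
gray_rows = [(1 << i) | ((1 << (i + 1)) if i + 1 < n else 0) for i in range(n)]

# Vector reversal: identity matrix with an all-ones complement vector,
# so y = x XOR (2^n - 1) = (2^n - 1) - x.
identity_rows = [1 << i for i in range(n)]

print([bmmc_target(x, bit_reversal_rows, 0) for x in range(1 << n)])
print([bmmc_target(x, gray_rows, 0) for x in range(1 << n)])
print([bmmc_target(x, identity_rows, (1 << n) - 1) for x in range(1 << n)])
```

The three matrices correspond to three of the permutation classes named in the abstract; the complement vector is what distinguishes vector reversal from the identity mapping.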
Some level-2 and level-3 Distributed Basic Linear Algebra Subroutines (DBLAS) that have been impleme...
It was conjectured that a permutation matrix with bandwidth b can be written as a product of no more...
The most efficient way to calculate strong bisimilarity is by finding the relational coarsest partit...
We give asymptotically equal lower and upper bounds for the number of parallel I/O operations requir...
The ability to perform permutations of large data sets in place reduces the amount of necessary avai...
Increasingly, modern computing problems, including many scientific and business applications, requir...
This paper presents an architecture-independent method for performing BMMC permutations on multiproc...
Optimal usage of the memory system is a key element of fast GPU algorithms. Unfortunately many commo...
The authors implemented and measured several methods to perform BMMC permutations on the MasPar MP-2...
In a generalized shuffle permutation, an address (a[q-1]a[q-2]...a[0]) receives its content from an a...
We tackle the feasibility and efficiency of two new parallel algorithms that s...
Permuting a vector is a fundamental primitive which arises in many applications. In particul...
We present a linear algebraic formulation for a class of index transformations such as Gray code enc...