This thesis presents a novel algorithm for Transposing Rectangular matrices In-place and in Parallel (TRIP), including a proof of correctness and an analysis of work, span and parallelism. Almost 60 years after its introduction, the problem of in-place rectangular matrix transposition still lacks a satisfying solution. Increased concurrency in today's computers and the need for low-overhead algorithms to solve memory-intensive challenges motivate the development of algorithms like TRIP. The algorithm is based on recursively splitting the matrix into sub-matrices, transposing these sub-matrices independently and in parallel, and then combining the results with a parallel perfect shuffle. We prove correctness of the alg...
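The split, transpose, and combine structure described above can be illustrated with a minimal sequential sketch. This is not the thesis's implementation: the function name `transpose_flat`, the row-major flat-list layout, and the choice to always split by rows are assumptions made for illustration. The combining step here interleaves through a temporary list, so the sketch is neither in-place nor parallel, whereas TRIP combines the results with an in-place, parallel perfect shuffle.

```python
def transpose_flat(a, lo, rows, cols):
    """Transpose the rows x cols row-major block stored in a[lo : lo + rows * cols].

    Illustrative sketch only: the merge below uses a temporary list, so this
    is NOT in-place, and the two recursive calls run one after the other.
    """
    if rows <= 1 or cols <= 1:
        # A single row or a single column has the same flat layout
        # as its transpose, so there is nothing to do.
        return

    # Split the block into a top and a bottom sub-matrix (contiguous in memory).
    top = rows // 2
    bottom = rows - top
    mid = lo + top * cols

    # Transpose both halves. The two calls touch disjoint ranges of `a`,
    # which is what would allow them to run in parallel.
    transpose_flat(a, lo, top, cols)       # now a cols x top block
    transpose_flat(a, mid, bottom, cols)   # now a cols x bottom block

    # Combine: row j of the result is row j of the transposed top half
    # followed by row j of the transposed bottom half.
    merged = []
    for j in range(cols):
        merged.extend(a[lo + j * top : lo + (j + 1) * top])
        merged.extend(a[mid + j * bottom : mid + (j + 1) * bottom])
    a[lo : lo + rows * cols] = merged


matrix = [1, 2, 3,
          4, 5, 6]              # 2 x 3, row-major
transpose_flat(matrix, 0, 2, 3)
print(matrix)                   # [1, 4, 2, 5, 3, 6], i.e. the 3 x 2 transpose
```

When both halves have the same number of rows, the combining loop interleaves equal-sized chunks of the two halves, which is the role the perfect shuffle plays in the description above.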
An adaptive parallel matrix transpose algorithm optimized for distributed multicore archit...
A style for programming problems from matrix algebra is developed with a familiar example ...
Matrix operations are the core of many linear systems. Efficient matrix multiplication i...
This paper presents implementations of in‐place algorithms for transposing rectangular matrices. One...
This paper describes parallel matrix transpose algorithms on distributed memory concurrent processor...
We describe a decomposition for in-place matrix transposition, with applications to Array of Struct...
Modern computers keep following the traditional model of addressing memory lin...
A common operation in scientific computing is the multiplication of a sparse, rectangular or structu...
We consider the problem of matrix transpose on mesh-connected processor networks. On the theoretical...
Eklundh's (1972) algorithm to transpose a large matrix stored on an external device such as a disc h...
The mesh is an architecture that has many scientific applications, and matrix transpose is an import...
Transposing an N × N array that is distributed row- or column-wise across P = N processors is a fund...
The correctness of an in-place permutation algorithm is proved. The algorithm exchanges elements bel...
In this work, we present an approach to alleviate the potential benefit of adder graph algorithms by...
In the last twenty-five years there has been much research into “fast” matrix multiplication...