We present Quesadilla, a new algorithm for transposing sparse tensors. The algorithm converts the sparse tensor data structure to a list of coordinates and sorts it with a fast multi-pass radix algorithm that exploits knowledge of the requested transposition and the tensor's input partial coordinate ordering to provably minimize the number of parallel partial sorting passes. We evaluate both a serial and a parallel implementation of Quesadilla on a set of 19 tensors from the FROSTT collection, a set of tensors taken from scientific and data analytic applications. We compare Quesadilla and a generalization, Top-2-sadilla, to several state-of-the-art approaches, including the tensor transposition routine used in the SPLATT tensor factoriz...
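The coordinate-list-plus-radix-sort idea in the abstract above can be illustrated with a minimal sketch. This is not the Quesadilla algorithm: it is a plain least-significant-mode-first radix sort that always runs one stable counting-sort pass per mode, and it does not use the input's existing partial ordering to skip passes. The names `counting_sort_by_mode` and `transpose_coo` and the tuple-based coordinate layout are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

Coord = Tuple[int, ...]


def counting_sort_by_mode(
    coords: List[Coord], vals: List[float], mode: int, dim: int
) -> Tuple[List[Coord], List[float]]:
    """One stable counting-sort pass keyed on a single mode (hypothetical helper)."""
    # counts[k + 1] first holds the number of entries whose key is k.
    counts = [0] * (dim + 1)
    for c in coords:
        counts[c[mode] + 1] += 1
    # The prefix sum turns counts[k] into the first output slot for key k.
    for i in range(dim):
        counts[i + 1] += counts[i]
    out_c: List[Coord] = [()] * len(coords)
    out_v: List[float] = [0.0] * len(vals)
    # Scanning in input order keeps equal keys in their previous relative order.
    for c, v in zip(coords, vals):
        pos = counts[c[mode]]
        out_c[pos] = c
        out_v[pos] = v
        counts[c[mode]] += 1
    return out_c, out_v


def transpose_coo(
    coords: List[Coord], vals: List[float], dims: Sequence[int], perm: Sequence[int]
) -> Tuple[List[Coord], List[float]]:
    """Naive transposition: one pass per mode, least significant output mode first."""
    for mode in reversed(perm):
        coords, vals = counting_sort_by_mode(coords, vals, mode, dims[mode])
    # Relabel each coordinate so that output mode i is input mode perm[i].
    return [tuple(c[m] for m in perm) for c in coords], vals


# Small illustrative tensor (made-up data): 4 nonzeros in a 2 x 3 x 3 tensor.
coords = [(0, 1, 2), (0, 0, 1), (1, 0, 0), (1, 2, 1)]
vals = [1.0, 2.0, 3.0, 4.0]
t_coords, t_vals = transpose_coo(coords, vals, dims=(2, 3, 3), perm=(2, 0, 1))
```

Because every pass is stable, sorting by the least significant output mode first and the most significant last leaves the coordinates in lexicographic order of the transposed modes. A pass-minimizing method of the kind the abstract describes would, under this framing, skip passes that the input ordering already makes unnecessary.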
Tensor algorithms are a rapidly growing field of research with applications in many scientific domai...
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Comp...
The memory space taken to host and process large tensor graphs is a limiting factor for embedded Con...
This paper formalizes the problem of reordering a sparse tensor to improve the...
© 2020 Owner/Author. This paper shows how to generate code that efficiently converts sparse tensors ...
This paper shows how to optimize sparse tensor algebraic expressions by introducing temporary tensor...
The Canonical Polyadic Decomposition (CPD) of tensors is a powerful tool for analyzing multi-wa...
Multi-dimensional arrays, or tensors, are increasingly found in fields such as signal proc...
In this thesis, we apply a novel way of analyzing the MTTKRP algorithm. We look at multiple librarie...
We investigate an efficient parallelization of a class of algorithms for the ...
University of Minnesota Ph.D. dissertation. April 2019. Major: Computer Science. Advisor: George Ka...
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Com...
We investigate an efficient parallelization of a class of algorithms for the well-known Tucker decom...
We present in this paper a parallel algorithm that generates a low-rank approximation of a distribut...
Tensor CANDECOMP/PARAFAC (CP) decomposition has wide applications in statistical learning of latent ...