This report summarises the main points raised at a recent workshop on various extensions to the BLAS standard, held at the University of Tennessee in May 2016. In particular, the discussions focused on batched, reproducible, and reduced-precision BLAS extensions. Various members of the linear algebra community and representatives from industry were present to generate and evaluate ideas in each of these areas.
One trend in modern high performance computing (HPC) is to decompose a large linear algebra problem ...
Basic Linear Algebra Subprograms (BLAS) are building blocks for many other matrix computations. BLAS ...
This paper describes a C implementation of the proposed new BLAS Standard. Permitting mixtures of i...
This article describes the design rationale, a C implementation, and conformance testing of a subse...
This article describes the design rationale, a C implementation, and conformance testing of a subset...
This paper summarizes the BLAS Technical Forum Standard, a specification of a set of kernel routine...
A current trend in high-performance computing is to decompose a large linear algebra problem into ba...
The BLAS-like Library Instantiation Software (BLIS) is a framework for the rapid instantiation of ba...
This paper proposes an API for Batched Basic Linear Algebra Subprograms (Batched BLAS). We focus on...
Due to non-associativity of floating-point operations and dynamic scheduling on par...
This working note examines different Fortran implementations of the Basic Linear Algebra Subprograms...
We provide timing results for common linear algebra subroutines across BLAS (Basic Linear Algebra S...
BLIS is a new software framework for instantiating high-performance BLAS-like dense linear algebra l...