BLIS is a new framework for rapid instantiation of the BLAS. We describe how BLIS extends the “GotoBLAS approach” to implementing matrix multiplication (gemm). While gemm was previously implemented as three loops around an inner kernel, BLIS exposes two additional loops within that inner kernel, casting the computation in terms of the BLIS micro-kernel so that porting gemm becomes a matter of customizing this micro-kernel for a given architecture. We discuss how this facilitates a finer level of parallelism that greatly simplifies the multithreading of gemm as well as additional opportunities for parallelizing multiple loops. Specifically, we show that with the advent of many-core architectures such as the IBM PowerPC A2 processor (used by...
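To make the loop structure concrete, the following is a minimal C sketch of five loops around a micro-kernel in the spirit of the BLIS/GotoBLAS layering described above. The routine names (gemm_sketch, micro_kernel), the tiny blocking parameters, the row-major layout, and the omission of packing and edge-case handling are all assumptions made here for illustration; this is not the actual BLIS code, in which only the micro-kernel is specialized (typically in assembly or vector intrinsics) for each architecture.

```c
#include <stdio.h>

/* Hypothetical cache/register blocking parameters, chosen small so the
   example runs as-is; in practice these are tuned per architecture. */
enum { NC = 8, KC = 8, MC = 8, NR = 4, MR = 4 };

/* Micro-kernel: C(MR x NR) += A(MR x k) * B(k x NR), all row-major.
   In the BLIS approach this is the one routine rewritten when porting
   gemm to a new architecture; plain C is used here for illustration. */
static void micro_kernel(int k,
                         const double *A, int lda,
                         const double *B, int ldb,
                         double *C, int ldc)
{
    for (int p = 0; p < k; ++p)
        for (int i = 0; i < MR; ++i)
            for (int j = 0; j < NR; ++j)
                C[i * ldc + j] += A[i * lda + p] * B[p * ldb + j];
}

/* Five loops around the micro-kernel: the three outer loops block for the
   memory hierarchy, and the two inner loops are the ones BLIS exposes
   inside the former "inner kernel". Packing of A and B into contiguous
   buffers is omitted, and m, n, k are assumed to be multiples of the
   block sizes, for brevity. */
static void gemm_sketch(int m, int n, int k,
                        const double *A, int lda,
                        const double *B, int ldb,
                        double *C, int ldc)
{
    for (int jc = 0; jc < n; jc += NC)              /* 5th loop: NC columns of C  */
        for (int pc = 0; pc < k; pc += KC)          /* 4th loop: KC of k dimension */
            for (int ic = 0; ic < m; ic += MC)      /* 3rd loop: MC rows of C      */
                for (int jr = 0; jr < NC; jr += NR)     /* 2nd loop: NR columns */
                    for (int ir = 0; ir < MC; ir += MR) /* 1st loop: MR rows    */
                        micro_kernel(KC,
                                     &A[(ic + ir) * lda + pc], lda,
                                     &B[pc * ldb + jc + jr], ldb,
                                     &C[(ic + ir) * ldc + jc + jr], ldc);
}

int main(void)
{
    enum { M = 8, N = 8, K = 8 };
    double A[M * K], B[K * N], C[M * N];
    for (int i = 0; i < M * K; ++i) A[i] = 1.0;
    for (int i = 0; i < K * N; ++i) B[i] = 1.0;
    for (int i = 0; i < M * N; ++i) C[i] = 0.0;

    gemm_sketch(M, N, K, A, K, B, N, C, N);

    /* With all-ones inputs, every entry of C should equal K = 8. */
    printf("C[0] = %.1f, C[last] = %.1f\n", C[0], C[M * N - 1]);
    return 0;
}
```

In the full framework, panels of A and B are first packed into contiguous, cache-friendly buffers, and parallelism can be extracted from several of these loops rather than only the outermost one, which is what makes the multithreading discussed above comparatively straightforward.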
For the past decade, power/energy consumption has become a limiting factor for large-scale and embed...
In the last ten years, GPUs have dominated the market considering the computin...
The trend of computing faster and more efficiently has been a driver for the computing industry sinc...
A current trend in high-performance computing is to decompose a large linear algebra problem into ba...
BLIS is a new software framework for instantiating high-performance BLAS-like dense linear algebra l...
A scalable parallel algorithm for matrix multiplication on SISAMD computers is presented. Our method...
This paper describes a novel parallel algorithm that implements a dense matrix multiplication operat...
One of the key areas for enabling users to efficiently use an HPC system is providing optimized BLAS...
General matrix-matrix multiplications (GEMM) in vendor-supplied BLAS libraries are best optimized fo...
Abstract. Implementations of the Basic Linear Algebra Subprograms (BLAS) interface are major buildin...
This paper presents the design and implementation of a highly efficient Double-precision General Matr...
Abstract — In this paper, we introduce a scalable macro-pipelined architecture to perform floating p...
The dissemination of multi-core architectures and the later irruption of massively parallel devices,...
Abstract. Goto wrote code that greatly improved GEMM performance and was once the fastest implementation in the world. In...