Abstract. In this paper we present a detailed study on tuning double-precision matrix-matrix multiplication (DGEMM) on the Intel Xeon E5-2680 CPU. We selected an optimal algorithm from the instruction-set perspective, as well as software tools optimized for Intel Advanced Vector Extensions (AVX). Our optimizations included the use of vector memory operations and AVX instructions. Our proposed algorithm achieves a performance improvement of 33% compared to the latest results achieved using the Intel Math Kernel Library DGEMM subroutine.
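To make the approach mentioned in the abstract concrete, the sketch below shows what an AVX-based DGEMM micro-kernel can look like: a 4x4 block of C is held in vector registers and updated with vector loads, broadcasts, multiplies, and adds. This is a minimal illustration under assumed conventions (packed A/B panels, column-major C, and the hypothetical names dgemm_4x4_avx, A_panel, B_panel), not the authors' actual kernel; since the Xeon E5-2680 supports AVX but not FMA, the update uses separate multiply and add instructions.

/*
 * Minimal sketch of a 4x4 DGEMM micro-kernel using AVX intrinsics.
 * Assumptions (not from the paper): A_panel packs 4 rows of A k-major
 * (4 doubles per k), B_panel packs 4 columns of B k-major, and C is
 * column-major with leading dimension ldc.
 */
#include <immintrin.h>
#include <stdio.h>

static void dgemm_4x4_avx(int K, const double *A_panel,
                          const double *B_panel, double *C, int ldc)
{
    /* Keep the 4x4 block of C resident in four YMM registers. */
    __m256d c0 = _mm256_loadu_pd(&C[0 * ldc]);
    __m256d c1 = _mm256_loadu_pd(&C[1 * ldc]);
    __m256d c2 = _mm256_loadu_pd(&C[2 * ldc]);
    __m256d c3 = _mm256_loadu_pd(&C[3 * ldc]);

    for (int k = 0; k < K; ++k) {
        /* Vector load of the 4x1 slice of A for this k. */
        __m256d a = _mm256_loadu_pd(&A_panel[4 * k]);

        /* Broadcast each B entry and accumulate a rank-1 update. */
        __m256d b0 = _mm256_broadcast_sd(&B_panel[4 * k + 0]);
        __m256d b1 = _mm256_broadcast_sd(&B_panel[4 * k + 1]);
        __m256d b2 = _mm256_broadcast_sd(&B_panel[4 * k + 2]);
        __m256d b3 = _mm256_broadcast_sd(&B_panel[4 * k + 3]);

        /* AVX on Sandy Bridge has no FMA: multiply, then add. */
        c0 = _mm256_add_pd(c0, _mm256_mul_pd(a, b0));
        c1 = _mm256_add_pd(c1, _mm256_mul_pd(a, b1));
        c2 = _mm256_add_pd(c2, _mm256_mul_pd(a, b2));
        c3 = _mm256_add_pd(c3, _mm256_mul_pd(a, b3));
    }

    /* Vector stores write the updated block back to C. */
    _mm256_storeu_pd(&C[0 * ldc], c0);
    _mm256_storeu_pd(&C[1 * ldc], c1);
    _mm256_storeu_pd(&C[2 * ldc], c2);
    _mm256_storeu_pd(&C[3 * ldc], c3);
}

int main(void)
{
    enum { K = 8 };
    double A[4 * K], B[4 * K], C[16] = {0};

    /* Simple test data: A rows are 1..4, B entries are k+1. */
    for (int k = 0; k < K; ++k)
        for (int i = 0; i < 4; ++i) {
            A[4 * k + i] = i + 1;
            B[4 * k + i] = k + 1;
        }

    dgemm_4x4_avx(K, A, B, C, 4);   /* ldc = 4: contiguous 4x4 C */
    printf("C[0][0] = %g\n", C[0]); /* expected 1*(1+2+...+8) = 36 */
    return 0;
}

Compiled with, for example, gcc -O2 -mavx, this kernel would form the innermost loop of a blocked DGEMM; the full routine adds cache-level blocking and panel packing around it.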