High-Performance Matrix Multiplication: Hierarchical Data Structures, Optimized Kernel Routines, and Qualitative Performance Modeling

Wu, Wenhao

Open PDF

Open link

Publication date

August 2003

Publisher

Scholars Junction

Language

English

Abstract

The optimal implementation of matrix multiplication on modern computer architectures is of great importance for scientific and engineering applications. However, achieving the optimal performance for matrix multiplication has been continuously challenged both by the ever-widening performance gap between the processor and memory hierarchy and the introduction of new architectural features in modern architectures. The conventional way of dealing with these challenges benefits significantly from the blocking algorithm, which improves the data locality in the cache memory, and from the highly tuned inner kernel routines, which in turn exploit the architectural aspects on the specific processor to deliver near peak performance. A state-of-art im...

Extracted data

We use cookies to provide a better user experience.

Data Protection

High-Performance Matrix Multiplication: Hierarchical Data Structures, Optimized Kernel Routines, and Qualitative Performance Modeling

Abstract

Extracted data

High-Performance Matrix Multiplication: Hierarchical Data Structures, Optimized Kernel Routines, and Qualitative Performance Modeling

Abstract

Extracted data

Related items

Related items