Matrix multiplication may be considered a model problem for analyzing the performance of more complex algorithms. On Cray and IBM computer systems, library routines for this task operate at high megaflop rates. Other programs from numerical linear algebra do not always achieve this level of sophistication; for example, they may suffer from performance degradation caused by memory access conflicts. This effect has been studied by considering the performance of subroutines for matrix multiplication on the Cray X-MP, Cray Y-MP, and IBM 3090. Results are analyzed by means of simulation. It is shown that, on a Cray, the performance degradation caused by bank conflicts may be reduced if the stride of memory references is odd. It is demonstrated tha...
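The odd-stride observation above can be illustrated with a toy model of interleaved memory. The sketch below is not the simulation used in the paper; it assumes a single stream of references with constant stride, a power-of-two number of banks, and a fixed bank busy time. The values 16 banks and 4 cycles, and the function names, are illustrative assumptions rather than Cray X-MP or Y-MP parameters. With a power-of-two bank count, a stride s touches num_banks / gcd(s, num_banks) distinct banks, so any odd stride visits every bank before returning to the first one.

```python
from math import gcd


def banks_touched(stride, num_banks):
    """Distinct banks visited by a constant-stride reference stream."""
    return num_banks // gcd(stride, num_banks)


def simulate_cycles(stride, num_banks, busy_time, num_accesses):
    """Cycles needed to issue num_accesses references, one per cycle at best.

    Toy model (assumption): word address a lives in bank a % num_banks,
    and a bank stays busy for busy_time cycles after each reference.
    """
    free_at = [0] * num_banks          # first cycle at which each bank is free
    cycle = 0
    for i in range(num_accesses):
        bank = (i * stride) % num_banks
        cycle = max(cycle, free_at[bank])   # stall until the bank is free
        free_at[bank] = cycle + busy_time
        cycle += 1                          # the reference itself takes one issue slot
    return cycle


if __name__ == "__main__":
    NUM_BANKS, BUSY_TIME, N = 16, 4, 64     # assumed, illustrative values
    for stride in (1, 3, 8, 16):
        print(stride,
              banks_touched(stride, NUM_BANKS),
              simulate_cycles(stride, NUM_BANKS, BUSY_TIME, N))
```

In this model, odd strides such as 1 and 3 never stall, while strides that share a large power-of-two factor with the bank count keep returning to the same few banks and pay the busy time repeatedly. In a column-major matrix code, the corresponding remedy would be to pad the leading dimension so that row-wise references use an odd stride.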
This Master's thesis examines whether a matrix multiplication program that combines the two efficiency stra...
The Basic Linear Algebra Subprograms, BLAS, are the basic computational kernels in most ap...
The imbalance between processor speed and memory access time is one characteristic issue of modern h...
Matrix multiplication (MM) is a computationally-intensive operation in many algorithms used in scien...
In this work, the performance of basic and Strassen’s matrix multiplication algorithms ar...
During the last half-decade, a number of research efforts have centered around developing software f...
Some level-2 and level-3 Distributed Basic Linear Algebra Subroutines (DBLAS) that have been impleme...
Parallel computing on networks of workstations is intensively used in some application areas such a...
The performance of a parallel matrix-matrix-multiplication routine with the same functionality as DG...
Strassen's algorithm for matrix multiplication gains its lower arithmetic complexity at the expe...
Mathematical software libraries represent collections of subprograms for the solution of fre...
The paper presents analysis of matrix multiplication algorithms from the point of view of their effi...
Submitted for publication to IEEE TPDS. The performance of both serial and parallel implementations o...
Matrix multiplication (hereafter we use the acronym MM) is among the most fundamental operations of ...
Memory contention can be a major source of overhead in large-scale shared-memory multiprocessors. Al...