Graphics hardware's performance is advancing much faster than that of conventional microprocessors. To exploit the computing power of these systems, it is critical to tune software to the architectural features of graphics hardware. Frequent changes in GPU architecture and performance characteristics make it highly desirable to automate such tuning. This paper implements an automatic tuning system that generates high-performance matrix-multiplication implementations on graphics hardware. The system uses a parameterized code generator to produce multiple versions of matrix multiplication, whose performance is evaluated empirically by actual execution on the target platform. An ad-hoc searc...
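The generate-and-measure loop described in the abstract can be illustrated with a minimal sketch. It is not the paper's system: it runs on the CPU in plain Python rather than on graphics hardware, it tunes only three tile sizes, and it uses an exhaustive sweep in place of the ad-hoc search the abstract mentions. The names `make_blocked_matmul`, `time_variant`, and `autotune` are hypothetical and chosen for illustration only.

```python
import itertools
import random
import time


def make_blocked_matmul(bi, bj, bk):
    """Hypothetical 'code generator': return a matmul variant for tile sizes (bi, bj, bk)."""
    def matmul(a, b, n):
        # c = a * b for n x n matrices stored as lists of rows.
        c = [[0.0] * n for _ in range(n)]
        for i0 in range(0, n, bi):
            for k0 in range(0, n, bk):
                for j0 in range(0, n, bj):
                    # Multiply one tile; min() handles sizes not divisible by the tile.
                    for i in range(i0, min(i0 + bi, n)):
                        row_a, row_c = a[i], c[i]
                        for k in range(k0, min(k0 + bk, n)):
                            aik, row_b = row_a[k], b[k]
                            for j in range(j0, min(j0 + bj, n)):
                                row_c[j] += aik * row_b[j]
        return c
    return matmul


def time_variant(fn, a, b, n, repeats=3):
    """Empirical evaluation: run the variant and keep the best wall-clock time."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(a, b, n)
        best = min(best, time.perf_counter() - start)
    return best


def autotune(n=32, tile_sizes=(4, 8, 16)):
    """Sweep every tile-size combination and return the fastest one with all timings."""
    rng = random.Random(0)  # fixed seed so every variant sees the same input
    a = [[rng.random() for _ in range(n)] for _ in range(n)]
    b = [[rng.random() for _ in range(n)] for _ in range(n)]
    results = {}
    for params in itertools.product(tile_sizes, repeat=3):
        results[params] = time_variant(make_blocked_matmul(*params), a, b, n)
    best = min(results, key=results.get)
    return best, results


if __name__ == "__main__":
    best, results = autotune()
    print("fastest tile sizes (bi, bj, bk):", best)
```

The winning tile sizes depend on the machine the sketch runs on, which is the point of empirical tuning: the same sweep repeated on a different platform may select a different variant.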
Matrix multiplication is at the core of high-performance numerical computation. Software methods of ...
We propose and evaluate a novel strategy for tuning the performance of a class of stencil computatio...
This is the Accepted Manuscript version of the following article: V. Kelefouras, A Kritikakou I. Mpo...
Graphics hardware’s performance is advancing much faster than the performance of conventional microp...
Graphics hardware's performance is advancing much faster than the performance of conventional microp...
In order to utilize the tremendous computing power of graphics hardware and to automatically adapt t...
As computer architectures become more complex, the task of writing efficient programs to best utilize...
The development of high performance dense linear algebra (DLA) critically depends on highly optimize...
Graphics Processing Units (GPUs) have revolutionized the computing landscape in the past decade and ...
Graphics Processing Units (GPUs) have revolutionized the HPC landscape. The first generation of exas...
Graphics Processing Units (GPUs) have revolutionized the HPC landscape. The first generation of exas...
The use of auto-tuning techniques in a matrix multiplication routine for hybrid CPU+GPU platforms i...
In high-performance computing, excellent node-level performance is required for the efficient use of...
As computer architectures become more complex, the task of writing efficient programs to best utilize...
The introduction of auto-tuning techniques in linear algebra routines using hybrid combinati...