Abstract—This paper presents the results of several experiments evaluating the performance of NVIDIA processors implementing the new Tesla architecture in matrix-vector multiplication. Three matrix forms, dense, banded, and sparse, are considered on three hardware platforms: the NVIDIA Tesla C870 computing board, the NVIDIA GeForce 8800 GTX graphics card, and one of the newest Intel Xeon processors, the E5462, with a 1.6 GHz front-side bus. The conclusions from the experiments indicate what speed-ups can be expected when accelerators in the form of the presented GPUs, rather than standard CPUs, are used for the considered computational kernels.
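The dense and sparse matrix-vector kernels benchmarked in this abstract can be sketched as follows. This is an illustrative NumPy version for reference only, not the paper's CUDA implementation; the function names and the CSR (compressed sparse row) layout are assumptions for the sketch:

```python
import numpy as np

def dense_matvec(A, x):
    # Dense kernel: y[i] = sum_j A[i, j] * x[j]
    return A @ x

def csr_matvec(data, indices, indptr, x):
    # Sparse kernel in CSR format: row i stores its nonzero values in
    # data[indptr[i]:indptr[i+1]], at the column positions given by
    # indices[indptr[i]:indptr[i+1]].
    y = np.zeros(len(indptr) - 1)
    for i in range(len(y)):
        start, end = indptr[i], indptr[i + 1]
        y[i] = data[start:end] @ x[indices[start:end]]
    return y
```

On a GPU, each row (or a group of rows) of either kernel would typically be assigned to its own thread, which is what makes the operation a natural benchmark for the architectures compared above.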
This paper presents initial experiments in implementing two notable matrix multiplication algorithms...
Low-power, high-performance computing nowadays relies on accelerator cards to speed up the calculati...
Abstract—This paper presents a performance modeling and optimization analysis tool to predict and op...
Modern graphics processing units (GPUs) have been at the leading edge of increasing chip-level para...
Simulations are indispensable for engineering. They make it possible to perform fa...
This paper presents an integrated analytical and profile-based cross-architecture performance modeli...
The present work is an analysis of the performance of the basic vector operations AXPY, DOT and SpMV...
Today’s computer systems are evolving toward lower energy consumption while maintaining high performance. The...
Commodity clusters augmented with application accelerators are evolving as competitive high performa...
Matrix multiplication is at the core of high-performance numerical computation. Software methods of ...
In today's sound-localization algorithms, matrix calculations are ubiquitous. Therefo...
The sparse Matrix-Vector multiplication is a key operation in science and engineering along with th...
To address the high computational cost and long run times of large matrix multiplication, thi...
This paper discusses different approaches for computing the Walsh spectra on graphics processor unit...
NVIDIA have released a new platform (CUDA) for general purpose computing on their graphical processi...