Abstract. GPUs have been used successfully to accelerate many mathematical functions and libraries. A common limitation of those libraries is that the primitives they handle must reach a minimum size before a significant speedup over the CPU versions is achieved. This minimum size requirement can prove prohibitive for many applications. It can be relaxed by batching operations so that enough data is available to run the calculation at maximum efficiency on the GPU. This paper describes a fast OpenCL implementation of two basic vector functions – vector reduction and vector scaling. Its performance is analyzed by running benchmarks on two of the most common GPUs in use – Tesla and Fermi GPUs from NVIDIA. Reported exp...
The present work is an analysis of the performance of the basic vector operations AXPY, DOT and SpMV...
OpenCL has been proposed as a means of accelerating functional computation using FPGA and GPU accele...
Latent Semantic Analysis (LSA) aims to reduce the dimensions of large term-document datasets using S...
Abstract. Implementations of the Basic Linear Algebra Subprograms (BLAS) interface are major buildin...
In this work, we evaluate OpenCL as a programming tool for developing performance-portable applicati...
We propose two high-level application programming interfaces (APIs) to use a graphics processing uni...
The Simplex algorithm is a well-known method to solve linear programming (LP) ...
We present several algorithms to compute the solution of a linear system of equations on a GPU, as ...
Direct and iterative methods are often used to solve linear systems in enginee...
Abstract. In recent years, parallel processing has been widely used in the computer industry. Software...
Modern graphics processing units (GPUs) have been at the leading edge of increasing chip-level para...
Abstract. Sparse linear algebra is a cornerstone of modern computational science. These algorithms ig...
The trend of using co-processors as accelerators to perform certain tasks is rising in the parallel...
We present VOBLA, a domain-specific language designed for programming linear a...
Parallel accelerators are playing an increasingly important role in scientific computing. However, i...