This dataset contains the execution times of four BLAS Level 1 operations (ASUM, DOT, SCAL, and AXPY) implemented using extended- and multiple-precision software for central processing units (CPUs) and CUDA-compatible graphics processing units (GPUs). The experiments were conducted on an Intel Core i5-4590 processor and an NVIDIA Turing RTX 2060 graphics card. Each raw file contains the results of three test runs and details of the experimental setup. In each test run, the BLAS function was called ten times, and the total execution time of the ten iterations (in milliseconds) was measured. For comparison, the execution times of double-precision routines from the OpenBLAS and cuBLAS packages are also presented. The comple...
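The measurement scheme described above (three test runs, each timing ten back-to-back calls and reporting the total in milliseconds) can be sketched in Python as follows. This is a minimal illustration only, not the code used to produce the dataset: the pure-Python `axpy` reference, the helper names `time_total_ms` and `run_benchmark`, and the vector length are all assumptions introduced here.

```python
import time


def axpy(alpha, x, y):
    """Reference AXPY: return alpha * x + y, element by element."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def time_total_ms(func, repeats=10):
    """Call func `repeats` times and return the total wall time in milliseconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        func()
    return (time.perf_counter() - start) * 1000.0


def run_benchmark(n=100_000, runs=3, repeats=10):
    """Return one total time (ms) per test run; n is a hypothetical vector length."""
    x = [1.0] * n
    y = [2.0] * n
    return [time_total_ms(lambda: axpy(0.5, x, y), repeats) for _ in range(runs)]


if __name__ == "__main__":
    for i, ms in enumerate(run_benchmark(), start=1):
        print(f"run {i}: {ms:.3f} ms total for 10 iterations")
```

Reporting the total over ten calls, rather than a single call, reduces the influence of timer resolution on short operations; dividing the total by ten gives a per-call estimate.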
Graphics processors are becoming faster and faster. Computational power within graphics processing uni...
Graphics processing units (GPUs) today can be used for computations that go beyond graphics and such...
Scientific computing applications often require support for non-traditional data types, for example,...
We provide timing results for common linear algebra subroutines across BLAS (Basic Linear Algebra S...
BLAS benchmark results for: CPUs: Intel Core i7-4790K, Intel Core i5-4590, Intel Core i5-4590, Inte...
In the last ten years, GPUs have dominated the market considering the computin...
This work reviews the experience of implementing different versions of the SSPR rank-one update oper...
Timing results for BLAS (Basic Linear Algebra Subprograms) libraries in R on diverse CPUs and GPUs. ...
Due to non-associativity of floating-point operations and dynamic scheduling on par...
A current trend in high-performance computing is to decompose a large linear algebra problem into ba...
This article describes the design rationale, a C implementation, and conformance testing of a subset...
In this paper we discuss our experiences in improving the performance of two key algorithms: t...