Abstract. The use of highly optimized inner kernels is of paramount importance for obtaining efficient numerical algorithms. Often, such kernels are created by hand. In this paper, however, we present an alternative way to produce efficient matrix multiplication kernels based on a set of simple codes which can be parameterized at compilation time. Using the resulting kernels we have been able to produce high performance sparse and dense linear algebra codes on a variety of platforms.
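The abstract above describes matrix multiplication kernels built from simple codes parameterized at compilation time. As an illustrative sketch only (not the paper's actual code), the idea can be mimicked in Python by fixing the blocking factors when the kernel is constructed, so the returned function is specialized for those parameters much as a compiler would specialize it:

```python
def make_matmul_kernel(mb, nb):
    """Return a matmul kernel specialized for an mb-by-nb blocking of C.

    In the compile-time setting of the abstract, mb and nb would be fixed
    constants known to the compiler; here the closure plays that role.
    """
    def matmul(A, B):
        m, k = len(A), len(A[0])
        n = len(B[0])
        C = [[0.0] * n for _ in range(m)]
        # Loop over blocks of C; the block sizes mb, nb are "baked in".
        for i0 in range(0, m, mb):
            for j0 in range(0, n, nb):
                for i in range(i0, min(i0 + mb, m)):
                    for j in range(j0, min(j0 + nb, n)):
                        acc = C[i][j]
                        for p in range(k):
                            acc += A[i][p] * B[p][j]
                        C[i][j] = acc
        return C
    return matmul

kernel = make_matmul_kernel(2, 2)
A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
C = kernel(A, B)
# C == [[19.0, 22.0], [43.0, 50.0]]
```

In a real implementation the specialization would happen in C or Fortran with the block sizes as preprocessor constants, allowing the compiler to fully unroll and register-allocate the inner block.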
Abstract—Scientific programmers often turn to vendor-tuned Basic Linear Algebra Subprograms (BLAS) t...
We have developed a framework based on relational algebra for compiling efficient sparse matrix cod...
One of the main obstacles to the efficient solution of scientific problems is the problem of tuning ...
The goal of the LAPACK project is to provide efficient and portable software for dense numerical lin...
Matrix computations lie at the heart of most scientific computational tasks. The solution of linear ...
This work is comprised of two different projects in numerical linear algebra. The first project is a...
The multiplication of a sparse matrix with a dense vector is a performance critical computational ke...
This thesis describes novel techniques and test implementations for optimizing numerically intensive...
Abstract. Dense linear algebra codes are often expressed and coded in terms of BLAS calls. This appr...
Abstract. Implementations of the Basic Linear Algebra Subprograms (BLAS) interface are major buildin...
Abstract. Sparse matrix-vector multiplication is an important computational kernel that tends to per...
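The kernel named in the abstract above, sparse matrix-vector multiplication, is commonly implemented over the Compressed Sparse Row (CSR) format. The following is a minimal illustrative sketch (not taken from the cited work):

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR format.

    values  -- nonzero entries, row by row
    col_idx -- column index of each nonzero
    row_ptr -- row_ptr[i]:row_ptr[i+1] delimits row i's nonzeros
    """
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y

# Example matrix:
# A = [[10, 0, 2],
#      [ 0, 3, 0]]
values  = [10.0, 2.0, 3.0]
col_idx = [0, 2, 1]
row_ptr = [0, 2, 3]
y = spmv_csr(values, col_idx, row_ptr, [1.0, 1.0, 1.0])
# y == [12.0, 3.0]
```

The indirect access `x[col_idx[k]]` is what tends to make this kernel memory-bound and hard for compilers to optimize, which motivates the tuning work these abstracts describe.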
A technique for optimizing software is proposed that involves the use of a standardized set of compu...
We present compiler technology for synthesizing sparse matrix code from (i) dense matrix code, and (...
In this paper, we deal with redistribution issues for dense linear algebra kernels on heterogeneous ...