The Opie Project aims to develop a compiler that transforms C code written for row-major matrix representation into equivalent code for Morton-order matrix representation, and to apply its techniques to other languages. Accepting a possible reduction in performance, we seek to compile a library of usable code to support future development of new algorithms better suited to Morton-ordered matrices. This paper reports the formalism behind the Opie compiler for C; its status, which is that it now compiles several standard Level-2 and Level-3 linear algebra operations; and a demonstration of a breakthrough reflected in a large reduction of L1, L2, and TLB misses. Overall performance improves on the Intel Xeon architecture.
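To make the two representations concrete, the following is a minimal sketch in C of how an element's offset could be computed under each layout. It is not taken from the Opie compiler: the function names (`spread_bits`, `row_major_index`, `morton_index`) are hypothetical, and the convention of placing row bits in the odd positions and column bits in the even positions is an assumption, since the abstract does not state which interleaving Opie uses.

```c
#include <stdint.h>

/* Spread the low 16 bits of x so that bit k moves to bit 2k
 * (the classic "interleave with zeros" bit trick). */
static uint32_t spread_bits(uint32_t x)
{
    x &= 0x0000FFFFu;
    x = (x | (x << 8)) & 0x00FF00FFu;
    x = (x | (x << 4)) & 0x0F0F0F0Fu;
    x = (x | (x << 2)) & 0x33333333u;
    x = (x | (x << 1)) & 0x55555555u;
    return x;
}

/* Row-major offset of element (i, j) in a matrix with ncols columns. */
static uint32_t row_major_index(uint32_t i, uint32_t j, uint32_t ncols)
{
    return i * ncols + j;
}

/* Morton-order (Z-order) offset of element (i, j), for i, j < 65536.
 * Assumed convention: row bits occupy the odd bit positions and
 * column bits the even ones. */
static uint32_t morton_index(uint32_t i, uint32_t j)
{
    return (spread_bits(i) << 1) | spread_bits(j);
}
```

Under this assumed convention, element (2, 3) of a 4-by-4 matrix sits at offset 11 in row-major order and at offset 13 in Morton order. Each aligned 2-by-2 block occupies four consecutive Morton offsets, which is the locality property that a reduction in cache and TLB misses would rely on.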
Since the early beginning of computer history, one has needed programming lang...
Recently an algorithm has been developed for column reduction of polynomial matrices. In a previous ...
This dissertation focuses on the design and the implementation of domain-specific compilers for line...
A proof of concept is offered for the uniform representation of matrices serially in Morton-order (o...
This thesis describes novel techniques and test implementations for optimizing numerically intensive...
The goal of the LAPACK project is to provide efficient and portable software for dense numerical lin...
Autotuning technology has emerged recently as a systematic process for evaluating alternat...
This work presents formal and practical tools to support the Alex paradigm for expressing and compi...
Modern microprocessors can achieve high performance on linear algebra kernels but this currently req...
Matrix-vector notation is the predominant idiom in which machine learning formulae are expressed; so...
Scientific programmers often turn to vendor-tuned Basic Linear Algebra Subprograms (BLAS) t...
Efficient implementation of matrix algebra is important to the performance of many large and...
Strassen's algorithm for matrix multiplication gains its lower arithmetic complexity at the expe...
This paper examines how to write code to gain high performance on modern computers as well as the im...