This dissertation introduces measurement-based performance modeling and prediction techniques for dense linear algebra algorithms. As a core principle, these techniques avoid executing such algorithms entirely and instead predict their performance from runtime estimates for the underlying compute kernels. For a variety of operations, these predictions make it possible to quickly select the fastest algorithm configurations from the available alternatives. We consider two scenarios that cover a wide range of computations: To predict the performance of blocked algorithms, we design algorithm-independent performance models for kernel operations that are generated automatically once per platform. For various matrix operations, instantaneous predictions...
This paper addresses the efficient exploitation of task-level parallelism, present in many dense lin...
We address some key issues in designing dense linear algebra (DLA) algorithms that are common for bo...
Over the last two decades, much progress has been made in the area of the high-performance sequ...
This dissertation incorporates two research projects: performance modeling and prediction for dense ...
It is well known that the behavior of dense linear algebra algorithms is greatly influenced...
Implementations of the Basic Linear Algebra Subprograms (BLAS) interface are major buildin...
We present a method for automatically selecting optimal implementations of spa...
In this article we present a systematic approach to the derivation of families of high-performance a...
This report builds on the work done in the deliverable [Nava94]. There it was shown tha...
Expressions that involve matrices and vectors, known as linear algebra expressions, are commonly eva...
Application performance dominated by a few computational kernels. Performance tuning today: Vendor-tun...
Dense linear algebra computations are essential to nearly every problem in scientific computing and ...
We address some key issues in designing dense linear algebra (DLA) algorithms that are com...