One of the greatest efforts of computational scientists is to translate the mathematical model describing a class of physical phenomena into large and complex codes. Many of these codes face the difficulty of implementing the mathematical operations in the model in terms of low-level optimized kernels offering both performance and portability. Legacy codes suffer from the additional curse of rigid design choices based on outdated performance metrics (e.g., minimization of memory footprint). Using a representative code from the Materials Science community, we propose a methodology to restructure the most expensive operations in terms of an optimized combination of dense linear algebra (BLAS3) kernels. The resulting algorithm guarantees an inc...
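As a generic illustration of the kind of restructuring the abstract refers to, the sketch below recasts a loop of matrix-vector products (BLAS2-style) as one matrix-matrix product (BLAS3-style). It is not taken from the code discussed in the paper; the function names `apply_columnwise` and `apply_blocked`, the matrix sizes, and the use of NumPy are assumptions made for illustration only.

```python
import numpy as np


def apply_columnwise(A, X):
    """Apply A to each column of X separately.

    Each iteration is one matrix-vector product (BLAS2-style), so A is
    read from memory once per column of X.
    """
    Y = np.empty((A.shape[0], X.shape[1]), dtype=np.result_type(A, X))
    for j in range(X.shape[1]):
        Y[:, j] = A @ X[:, j]
    return Y


def apply_blocked(A, X):
    """Apply A to all columns of X in a single matrix-matrix product.

    For floating-point arrays NumPy hands this to the underlying BLAS
    matrix-matrix routine, which can reuse blocks of A across columns.
    """
    return A @ X


# Hypothetical sizes, chosen only to keep the example fast.
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 150))
X = rng.standard_normal((150, 40))

Y_loop = apply_columnwise(A, X)
Y_gemm = apply_blocked(A, X)

# Both formulations compute the same result up to rounding.
assert np.allclose(Y_loop, Y_gemm)
```

Both functions perform the same arithmetic. The blocked form differs only in how the work is presented to the library, which is what lets an optimized BLAS3 kernel exploit data reuse in cache.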
Abstract. The use of highly optimized inner kernels is of paramount importance for obtaining effici...
This dissertation incorporates two research projects: performance modeling and prediction for dense ...
In this article we present a systematic approach to the derivation of families of high-performance a...
In this paper we focus on the integration of high-performance numerical libraries in ab initio codes...
This dissertation sets out to improve performance—in terms of runtime as well as accuracy—of Materia...
Design by Transformation (DxT) is an approach to software development that encodes domain-specific p...
Abstract. As computing hardware evolves, increasing core counts mean that memory bandwidth is becoming...
The goal of the LAPACK project is to provide efficient and portable software for dense numerical lin...
Abstract. Design by Transformation (DxT) is an approach to software development that encodes domain-sp...
Abstract. Implementations of the Basic Linear Algebra Subprograms (BLAS) interface are major buildin...
In the early days of numerical simulations, advances were based on the ingenuity of pioneer scientis...
Scientific Computation provides a critical role in the scientific process because it allows us as...
As computing hardware evolves, increasing core counts mean that memory bandwidth is becoming the dec...
Abstract. Dense linear algebra codes are often expressed and coded in terms of BLAS calls. This appr...