Abstract. In this work, the behavior of the multithreaded implementations of some LAPACK routines in PLASMA and Intel MKL is analyzed. The main goal is to develop a methodology for the installation and modelling of shared-memory linear algebra routines so that decisions that reduce the execution time can be taken at run time. Typical decisions are the number of threads to use, the block or tile size in algorithms by blocks or tiles, and the routine to use when several algorithms or implementations are available to solve the problem. Experiments carried out with PLASMA and Intel MKL show that these decisions can be taken automatically and that satisfactory execution times are obtained.
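The install-then-decide idea in the abstract can be illustrated with a minimal sketch. It is not the authors' methodology: the function names (`install`, `choose`, `toy_measure`), the candidate values, and the cost formula are all hypothetical, and the toy cost model stands in for real timing runs of PLASMA or Intel MKL routines.

```python
from itertools import product


def install(measure, sizes, threads, tiles, routines):
    """Installation phase: store the fastest (routine, threads, tile) per problem size."""
    table = {}
    for n in sizes:
        candidates = product(routines, threads, tiles)
        table[n] = min(candidates, key=lambda cfg: measure(n, *cfg))
    return table


def choose(table, n):
    """Run-time phase: reuse the decision stored for the closest installed size."""
    nearest = min(table, key=lambda installed: abs(installed - n))
    return table[nearest]


def toy_measure(n, routine, threads, tile):
    """Hypothetical cost model (assumption); a real installation would time the routine."""
    work = n ** 3 / threads                      # ideal parallel speed-up
    overhead = 2e7 * threads                     # made-up thread management cost
    tile_penalty = n * n * abs(tile - n // 8)    # made-up preference for tile close to n/8
    routine_factor = {"tiled": 1.0, "blocked": 1.2}[routine]
    return routine_factor * (work + tile_penalty) + overhead


# Installation: explore the candidate configurations once per representative size.
table = install(toy_measure, sizes=[512, 2048], threads=[1, 2, 4, 8],
                tiles=[64, 128, 256], routines=["tiled", "blocked"])

# Run time: pick routine, thread count, and tile size for a new problem size.
routine, num_threads, tile_size = choose(table, 1800)
```

Passing `measure` as a parameter keeps the selection logic separate from how execution times are obtained, so the toy model could be replaced by actual measurements or by a fitted performance model without changing `install` or `choose`.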
The promise of future many-core processors, with hundreds of threads running concurrently, has led t...
As multicore systems continue to gain ground in the high performance computing...
This dissertation incorporates two research projects: performance modeling and prediction for dense ...
The final publication is available at Springer via http://dx.doi.org/10.1007/s10766-013-0249-6. The in...
Abstract. The introduction of auto-tuning techniques in linear algebra routines using hybrid combinati...
This paper discusses the scalability of Cholesky, LU, and QR factorization routines on MIMD distribu...
Abstract. The use of an OpenMP compiler optimized for the corresponding multicore system is a good opt...
Software overheads can be a significant cause of performance degradation in parallel numerical libra...
The high performance computing (HPC) community is obsessed over the general matrix-matrix multiply (...
This paper discusses the design of linear algebra libraries for high performance computers. Particul...
This paper presents an overview of the LAPACK library, a portable, public-domain library to solve th...
This dissertation details contributions made by the author to the field of computer science while wo...
With the emergence of thread-level parallelism as the primary means for continued improvement of per...