Master's dissertation in Informatics Engineering. Matrix algorithms often deal with large amounts of data at a time, which impairs efficient cache memory usage. Recent collaborative work between the Numerical Algorithms Group and the University of Minho led to a blocked approach to the matrix square root algorithm with significant efficiency improvements, particularly in a multicore shared memory environment. Distributed memory architectures were left unexplored. In these systems data is distributed across multiple memory spaces, including those associated with specialized accelerator devices, such as GPUs. Systems with these devices are known as heterogeneous platforms. This dissertation focuses on studying the blocked matrix squar...
Abstract: One-sided dense matrix factorizations are important computational kernels in many scientific...
Matrix multiplication is at the core of high-performance numerical computation. Software methods of ...
Scientific Production. Supercomputers are becoming more heterogeneous. They are composed of several ma...
Proceedings of: Third International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2016...
Master's dissertation in Computer Science. Currently, most computing systems have access to more tha...
As Central Processing Units (CPUs) and Graphical Processing Units (GPUs) get progressively better, d...
We present a new approach to utilizing all CPU cores and all GPUs on heterogeneous multicore and mul...
This work develops two algorithms for decomposing general matrix multiplication (GEMM, d...
The Graphics Processing Unit (GPU) is present in almost every modern-day personal computer. Despite...
This paper presents and analyzes a heterogeneous implementation of an industrial use case based on K...
The sparse Matrix-Vector multiplication is a key operation in science and engineering along with th...
In a previous PPoPP paper we showed how the FLAME methodology, combined with the SuperMatrix runtim...
Abstract: Few realize that, for large matrices, many dense matrix computations achieve nearly the sa...
Few realize that, for large matrices, many dense matrix computations achieve nearly the same perform...
International audience. Nowadays GPUs have dominated the market considering the computing/power metric...