Proceedings of: Third International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2016). Sofia (Bulgaria), October 6-7, 2016. The work presented here is an experimental study of the execution time and energy consumption of matrix multiplication on a heterogeneous server. The server features three different devices: a multicore CPU, an NVIDIA Tesla GPU, and an Intel Xeon Phi coprocessor. Matrix multiplication is one of the most widely used linear algebra kernels and, consequently, applications that make intensive use of this operation can benefit greatly from efficient implementations. This is the case of the evaluation of matrix polynomials, a core operation used to calculate many matrix functions, which involve a ...
We present a new approach to utilizing all CPU cores and all GPUs on heterogeneous multicore and mul...
In this thesis, the performance and energy efficiency of four different implementations of matrix mu...
In this document, we describe two strategies of distribution of computations that can be used to imp...
(eng) In this paper, we address the issue of implementing matrix-matrix multiplication on heterogene...
For the past decade, power/energy consumption has become a limiting factor for large-scale and embed...
In this paper, we address the issue of implementing matrix-matrix multiplicati...
Parallel computing on networks of workstations is intensively used in some application areas such a...
Master's dissertation in Informatics Engineering. Matrix algorithms often deal with large amounts of ...
Matrix multiplication is at the core of high-performance numerical computation. Software methods of ...
Computing a matrix polynomial is the basic process in the calculation of functions of matrices by th...
General matrix-matrix multiplications (GEMM) in vendor-supplied BLAS libraries are best optimized fo...
Matrix multiplication is taken as a test bed for parallel processing on heterogeneous networks of wo...