Proceedings of: Third International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2016), Sofia (Bulgaria), October 6-7, 2016. The work presented here is an experimental study of the execution time and energy consumption of matrix multiplication on a heterogeneous server. The server features three different devices: a multicore CPU, an NVIDIA Tesla GPU, and an Intel Xeon Phi coprocessor. Matrix multiplication is one of the most widely used linear algebra kernels and, consequently, applications that make intensive use of this operation can benefit greatly from efficient implementations. This is the case of the evaluation of matrix polynomials, a core operation used to calculate many matrix functions, which involve a ...
In this paper, an adaptive matrix multiplication algorithm for dynamic heterogeneous environments is...
Master's dissertation in Informatics Engineering. Matrix algorithms often deal with large amounts of ...
In this document, we describe two strategies of distribution of computations that can be used to imp...
In this paper, we address the issue of implementing matrix-matrix multiplicati...
Computing a matrix polynomial is the basic process in the calculation of functions of matrices by th...
For the past decade, power/energy consumption has become a limiting factor for large-scale and embed...
Parallel computing on networks of workstations is intensively used in some application areas such a...
We present a new approach to utilizing all CPU cores and all GPUs on heterogeneous multicore and mul...
The use of auto-tuning techniques in a matrix multiplication routine for hybrid CPU+GPU platforms is...
A parallel matrix multiplication algorithm is presented, and studies of its performance and estimati...
As users and developers, we are witnessing the opening of a new computing scenario: the introduction...