In order to implement a complete Fast Multipole Method on the Cell processor, we need an efficient complex matrix multiplication on each Synergistic Processing Element (SPE). Since the latest IBM SDK does not provide such a routine, we wrote our own in single precision, in C. We show that complex matrix multiplication requires a specific computation scheme for the micro-kernel running on the SPE, and that a 32×32 tile is appropriate both for near-peak computation and for overlapping communication with computation. Our micro-kernel delivers 23.74 Gflop/s, which is 92.7% of the SPE peak performance, and we obtain up to 23.65 Gflop/s for one complete complex matrix product on one SP...
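As an illustration of the tiling described in this abstract, here is a minimal portable C sketch of a single-precision complex matrix product computed tile by tile. It is not the authors' SPE micro-kernel: it uses no SIMD intrinsics and no DMA. The split real/imaginary (planar) storage, the row-major layout, and all function names (`ctile_madd`, `cgemm_tiled`, `cgemm_flops`, `percent_of_peak`) are assumptions made for this example. Only the 32×32 tile size and the performance figures come from the abstract.

```c
#include <assert.h>
#include <stddef.h>

#define TILE 32 /* tile edge quoted in the abstract */

/* Hypothetical helper: C_tile += A_tile * B_tile for one TILE x TILE
 * complex tile. Real and imaginary parts live in separate row-major
 * arrays (an assumed layout) with leading dimension ld. */
static void ctile_madd(const float *ar, const float *ai,
                       const float *br, const float *bi,
                       float *cr, float *ci, size_t ld)
{
    for (size_t i = 0; i < TILE; ++i) {
        for (size_t k = 0; k < TILE; ++k) {
            const float xr = ar[i * ld + k];
            const float xi = ai[i * ld + k];
            for (size_t j = 0; j < TILE; ++j) {
                const float yr = br[k * ld + j];
                const float yi = bi[k * ld + j];
                /* (xr + i*xi) * (yr + i*yi), accumulated into C */
                cr[i * ld + j] += xr * yr - xi * yi;
                ci[i * ld + j] += xr * yi + xi * yr;
            }
        }
    }
}

/* Hypothetical driver: C = A * B for n x n complex matrices,
 * where n must be a multiple of TILE. */
void cgemm_tiled(size_t n,
                 const float *ar, const float *ai,
                 const float *br, const float *bi,
                 float *cr, float *ci)
{
    assert(n % TILE == 0);
    for (size_t i = 0; i < n * n; ++i) {
        cr[i] = 0.0f;
        ci[i] = 0.0f;
    }
    for (size_t ti = 0; ti < n; ti += TILE)
        for (size_t tj = 0; tj < n; tj += TILE)
            for (size_t tk = 0; tk < n; tk += TILE)
                ctile_madd(ar + ti * n + tk, ai + ti * n + tk,
                           br + tk * n + tj, bi + tk * n + tj,
                           cr + ti * n + tj, ci + ti * n + tj, n);
}

/* One complex multiply-add costs 4 multiplications and 4 additions,
 * so an n x n complex product counts 8 * n^3 floating-point operations. */
double cgemm_flops(size_t n)
{
    return 8.0 * (double)n * (double)n * (double)n;
}

/* Fraction of peak, in percent, for a measured rate (both in Gflop/s). */
double percent_of_peak(double gflops, double peak_gflops)
{
    return 100.0 * gflops / peak_gflops;
}
```

With the quoted 23.74 Gflop/s and a single-precision SPE peak of 25.6 Gflop/s (the value implied by the 92.7% figure), `percent_of_peak(23.74, 25.6)` gives roughly 92.7. On a real SPE, each tile would be fetched into the local store by DMA while the previous tile is being processed, which is the overlap the abstract refers to; this sketch leaves that part out.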
The slowing pace of commodity microprocessor performance improvements combined with ever-increasing ...
Today’s advanced research areas such as DNA computing, different branches of nanotechnology, immune ...
Proceedings of: Third International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2016...
Today’s computer systems develop towards less energy consumption while keeping high performance. The...
In this work, we examine the potential of using the recently-released STI Cell processor as a buildi...
The slowing pace of commodity microprocessor performance improvements combined with ever-increasing ...
ABSTRACT: In this paper, we propose one design for matrix-matrix multiplication. The one desi...
Abstract. Traditional parallel programming methodologies for improving performance assume cache-bas...
This paper presents the first deployment of the Fast Multipole Method on the C...
Abstract. Moore’s Law suggests that the number of processing cores on a single chip increases expone...
This paper proposes a micro-kernel to efficiently compute 4x4 8-bit matrix mul...
In this paper, a new methodology for computing the Dense Matrix Vector Multipl...
Using super-resolution techniques to estimate the direction that a signal arrived at a radio receive...
As users and developers, we are witnessing the opening of a new computing scenario: the introduction...
The matrix multiplication algorithm of Dekel, Nassimi and Sahani, or Hypercube algorithm, is analysed, m...