A tuned and scalable fast multipole method as a preeminent algorithm for exascale systems. Rio Yokota and Lorena A. Barba. Among the algorithms that are likely to play a major role in future exascale computing, the fast multipole method (FMM) appears as a rising star. Our recent work showed scaling of an FMM on GPU clusters, with problem sizes of the order of billions of unknowns. That work led to an extremely parallel FMM, scaling to thousands of GPUs or tens of thousands of CPUs. This paper reports on a campaign of performance tuning and scalability studies using multi-core CPUs on the Kraken supercomputer. All kernels in the FMM were parallelized using OpenMP, and a test using 10^7 particles randomly distributed in a cube showed 7...
Due to its O(N log N) complexity, the multilevel fast multipole ...
Learn about the fast multipole method (FMM) and its optimization on NVIDIA GPU...
Illustration of the components in a fast multipole method (FMM), with the upward sweep depicted o...
Invited Lecture at the SIAM "Encuentro Nacional de Ingeniería Matemática," at Pontificia U...
Poster featured at the NVIDIA exhibit booth in the Supercomputing Conference, November 2011, Seattle...
The Fast Multipole Method allows the rapid evaluation of sums of radial basis functions centered at ...
This thesis investigates possible optimizations of an efficient implementation of the multilevel fas...
With processor clock speeds having stagnated, parallel computing architectures have achieved a break...
This thesis presents a top-to-bottom analysis of designing and implementing fast algorithms for curr...
This work presents the first extensive study of single-node performance optimization, tuning, and a...
This article describes how we manage to increase performance and to extend fea...
Nowadays, the most powerful supercomputers in the world, needed for solving complex models and simu...