The Fast Multipole Method (FMM) rapidly evaluates, to a specified accuracy $\epsilon$, sums of radial basis functions centered at points distributed inside a computational domain, at a large number of evaluation points. The method scales as $O(N)$ in both time and memory, compared with $O(N^2)$ for the direct method, which allows larger problems to be solved with given resources. Graphics processing units (GPUs) are increasingly viewed as data-parallel compute coprocessors that can provide significant computational performance at low price. We describe acceleration of the FMM using the data-parallel GPU architecture. The FMM has a complex hierarchical (adaptive) structure, which is not easily implemented on data par...
This thesis presents a top to bottom analysis on designing and implementing fast algorithms for curr...
We have implemented the fast multipole method (FMM) on a special-purpose computer GRAPE (GRAvity piP...
We present parallel versions of a representative N-body application that uses Greengard and Rokhlin&...
Invited Lecture at the SIAM "Encuentro Nacional de Ingeniería Matemática," at Pontificia U...
Among the algorithms that are likely to play a major role in future exascale computing, the fast mul...
We present efficient algorithms to build data structures and the lists needed for fast multipole met...
Learn about the fast multipole method (FMM) and its optimization on NVIDIA GPU...
Computing on graphics processors is maybe one of the most important developments in computational sc...
The N-body problem appears in many computational physics simulations. At each time step the computat...
Solving an N-body problem, electrostatic or gravitational, is a crucial task and the main computatio...
This work presents the first extensive study of single-node performance optimization, tuning, and a...
This paper presents an optimized CPU–GPU hybrid implementation and a GPU performance model for the ...
A tuned and scalable fast multipole method as a preeminent algorithm for exascale systems Rio Yokota...
Fast Multipole Methods are a fundamental operation for the simulation of many ...