Abstract. We discuss an implementation of adaptive fast multipole methods targeting hybrid multicore CPU and GPU systems. Based on previous experience with the computational profile of our version of the fast multipole algorithm, suitable parts are off-loaded to the GPU, while the remaining parts are threaded and executed concurrently by the CPU. The parameters defining the algorithm affect the performance, and by measuring this effect we are able to dynamically balance the algorithm towards optimal performance. Our setup uses the dynamic nature of the computations and is therefore of general character.
Abstract—The Fast Multipole Method (FMM) is considered as one of the top ten algorithms of the 20th ...
Abstract—Graphics processing units (GPUs) brought huge performance improvements in the scientific an...
<b>Invited Lecture at the SIAM <i>"Encuentro Nacional de Ingeniería Matemática,"</i> at Pontificia U...
This paper presents an optimized CPU–GPU hybrid implementation and a GPU performance model for the ...
This work presents the first extensive study of single-node performance optimization, tuning, and a...
Learn about the fast multipole method (FMM) and its optimization on NVIDIA GPU...
It has been shown that fast multipole methods can achieve good scalability on multi-core architectur...
The Fast Multipole Method allows the rapid evaluation of sums of radial basis functions centered at ...
We present efficient algorithms to build data structures and the lists needed for fast multipole met...
Fast Multipole Methods are a fundamental operation for the simulation of many ...
We present parallel versions of a representative N-body application that uses Greengard and Rokhlin&...
The Fast Multipole Method allows the rapid evaluation of sums of radial basis functions centered at ...
Among the algorithms that are likely to play a major role in future exascale computing, the fast mul...
In the last two decades, physical constraints in chip design have spawned a paradigm shift in comput...
This paper presents two approaches to accelerate the MMD algorithm in multi-core environments. The M...