The integration of the equations of motion of N interacting particles is a classical problem in many branches of physics and chemistry. The direct N-body problem is at the heart of simulations studying Coulomb crystals. We present a hand-optimized code for the latest AVX-512 instruction set that achieves a single-core speedup of $\approx 340\%$ with respect to the version optimized by the compiler. The increased performance is due to an optimized organization of memory accesses in the inner loop of the Coulomb interaction and, especially, to the use of an intrinsic function to compute $1/\sqrt{x}$ faster. Our parallelization, which is implemented in OpenMP, achieves excellent scalability with the number of cores. In total, we achie...
This work considers the organization and performance of computations on parallel computers of tree...
We introduce a new class of parallel algorithms for the exact computation of systems with pairwise m...
Modern parallel architectures require applications to generate massive paralle...
O(N) algorithms for N-body simulations enable the simulation of particle systems with up to 100 mill...
The main performance bottleneck of gravitational N-body codes is the force calculation between two p...
Direct-summation N-body algorithms compute the gravitational interaction between stars in an exact w...
The N-body simulations have become a powerful tool to test the gravitational interaction among parti...
The present work attempts to integrate the independent efforts in the fast N-body community to crea...
Algorithms designed to efficiently solve this classical problem of physics fit very well on GPU hard...
We present a new implementation of the numerical integration of the classical, gravitational, N-body...
We present new analysis, algorithmic techniques, and implementations of the Fast Multipole Method (F...
We provide a novel and efficient algorithm for computing accelerations in the periodic large-N-body p...
General N-body problems are a set of problems in which an update to a single element in the system d...
N-body simulations are computation-intensive applications that calculate the motion of a l...
N-Body simulation simulates the evolution of a system that is composed of N particles, where each el...