In this paper we describe the performance of an $N$-body simulation of a star cluster with 64k stars on a Cray XD1 system with 400 dual-core Opteron processors. A number of astrophysical $N$-body simulations have been reported at SCxy conferences, and all previous Gordon Bell prize entries used at least 700k particles. The reason for this preference for large particle numbers is parallel efficiency: it is very difficult to achieve high performance on large parallel machines when the number of particles is small. However, for many scientifically important problems the calculation cost scales as $O(N^{3.3})$, so it is important to be able to use large machines efficiently for relatively small numbers of particles. We achieved 2.03 Tflops, or 57.7% of the theoretical peak performance.
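The flop counts behind such performance figures come from the all-pairs force calculation of a direct-summation code. As a rough illustration only (this is not the paper's code; the data layout, the softening parameter `eps2`, and all function names are assumptions), a minimal serial version of that $O(N^2)$ kernel looks like the following sketch.

```c
/* Minimal direct-summation gravity kernel: a generic sketch, not the code
 * described in the paper.  The O(N^2) pairwise loop is the part whose
 * floating-point work dominates the Tflops figures quoted for direct
 * N-body codes.  Names and the softening parameter are illustrative. */
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    double x[3];   /* position                 */
    double m;      /* mass                     */
    double a[3];   /* acceleration (output)    */
} Particle;

/* Sum the softened gravitational acceleration on every particle over all
 * other particles; eps2 is the square of the Plummer softening length. */
static void compute_forces(Particle *p, int n, double eps2)
{
    for (int i = 0; i < n; i++) {
        double ax = 0.0, ay = 0.0, az = 0.0;
        for (int j = 0; j < n; j++) {
            if (j == i) continue;
            double dx = p[j].x[0] - p[i].x[0];
            double dy = p[j].x[1] - p[i].x[1];
            double dz = p[j].x[2] - p[i].x[2];
            double r2 = dx * dx + dy * dy + dz * dz + eps2;
            double rinv   = 1.0 / sqrt(r2);
            double mr3inv = p[j].m * rinv * rinv * rinv;
            ax += mr3inv * dx;
            ay += mr3inv * dy;
            az += mr3inv * dz;
        }
        p[i].a[0] = ax; p[i].a[1] = ay; p[i].a[2] = az;
    }
}

int main(void)
{
    int n = 1024;                      /* small test size */
    Particle *p = malloc(n * sizeof(Particle));
    for (int i = 0; i < n; i++) {      /* random cube of equal-mass stars */
        for (int k = 0; k < 3; k++) p[i].x[k] = (double)rand() / RAND_MAX;
        p[i].m = 1.0 / n;
    }
    compute_forces(p, n, 1e-6);
    printf("a[0] = (%g, %g, %g)\n", p[0].a[0], p[0].a[1], p[0].a[2]);
    free(p);
    return 0;
}
```

In production star-cluster codes this loop is typically vectorized and decomposed across processors, and individual-timestep schemes recompute forces only for the subset of particles due for an update, which is what makes high parallel efficiency difficult at small $N$.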