Although there exist several approaches to rapidly solving the N-body problem, and a diversity of implementation strategies, the performance tradeoffs among these strategies on parallel computers, particularly with respect to problem-specific data distributions, are poorly understood. We present a synthetic workload model and a simulator that enable us to evaluate the performance tradeoffs encountered in implementing particle methods on MIMD computers. These results can be used to evaluate designs early in the implementation process.

1 Introduction

We present a comparative performance analysis of various strategies for implementing localized N-body solvers on MIMD distributed memory parallel computers. In localized N-body solvers, particle interac...
We discuss the performance of direct summation codes used in the simulation of astrophysical stellar...
Many physical models require the simulation of a large number ($N$) of particles interacting throug...
We describe the design of several portable and efficient parallel implementations of adaptive N-body...
This work considers the organization and performance of computations on parallel computers of tree...
O(N) algorithms for N-body simulations enable the simulation of particle systems with up to 100 mill...
Direct-summation N-body algorithms compute the gravitational interaction between stars in an exact w...
The O(N) hierarchical N-body algorithms and massively parallel processors allow particle systems of...
The O(N) hierarchical N-body algorithms and Massively Parallel Processors allow particle systems of ...
We report the design and performance of a computational molecular dynamics (MD) code for 400 million...
This paper discusses the implementation of particle based numerical methods on multi-core machines. ...
We present a performance analysis of different parallelization schemes for direct codes used in the ...
Implementations for molecular dynamics on parallel computers generally use either particle paralleli...
The O(N) hierarchical N-body algorithms and Massively Parallel Processors allow particle systems of ...
Parallel computer programs are used to speed up the calculation of computationally-demanding scienti...
In this paper, we present two new parallel formulations of the Barnes-Hut method. These parallel for...