Abstract—Conventional programming practices on multicore processors in high-performance computing architectures are not universally effective in terms of efficiency and scalability for many algorithms in scientific computing. One possible solution for improving efficiency and scalability in applications on this class of machines is the use of a many-tasking runtime system employing many lightweight, concurrent threads. Yet the potential performance and scalability impact of such runtime systems on existing applications developed around the bulk synchronous parallel (BSP) model is difficult to estimate a priori and is not well understood. In this work, we present a case study of a BSP particle-in-cell benchmark code that has been ported to a many-tasking ...
We have developed a threaded parallel data streaming approach using Globus to transfer multi-terabyt...
Many/multi-core supercomputers provide a natural programming paradigm for hybrid MPI/OpenMP scientif...
The computational performance of a smoothed particle hydrodynamics (SPH) simulation is inves...
The Gyrokinetic Toroidal Code (GTC) uses the particle-in-cell method to efficiently simulate plasma ...
The Gyrokinetic Toroidal code (GTC) (version 2) is a 3D particle-in-cell application developed at th...
Abstract. In this work, we discuss the porting to the GPU platform of the latest production version ...
The gyrokinetic Particle-in-Cell (PIC) method is a critical computational tool enabling petascale fu...
Computationally intensive applications with frequent communication and synchronization require caref...
The Gyrokinetic Toroidal Code (GTC) is a global, three-dimensional particle-in-cell application deve...
The gyrokinetic toroidal code at Princeton (GTC-P) is a highly scalable and portable particle-in-cel...
We present multicore parallelization strategies for the particle-to-grid interpolation step in the G...
This work presents a general methodology for estimating the performance of an HPC workload when runn...
The next decade of high-performance computing (HPC) systems will see a rapid evolution and divergenc...
Gyrokinetic codes in plasma physics need outstanding computational resources to solve increasingly c...