The Gyrokinetic Toroidal Code (GTC) uses the particle-in-cell method to efficiently simulate plasma microturbulence. This work presents novel analysis and optimization techniques to enhance the performance of GTC on large-scale machines. We introduce cell access analysis to better manage the tradeoff between locality and synchronization on CPU- and GPU-based architectures. Our optimized hybrid parallel implementation of GTC uses MPI, OpenMP, and NVIDIA CUDA; it achieves up to a 2× speedup over the reference Fortran version on multiple parallel systems and scales efficiently to tens of thousands of cores. © The Author(s) 2013
We have developed a threaded parallel data streaming approach using Globus to transfer multi-terabyt...
GENE solves the five-dimensional gyrokinetic equations to simulate the development and evolution of ...
Reliable predictive simulation capability addressing confinement properties in magnetically confined...
The gyrokinetic Particle-in-Cell (PIC) method is a critical computational tool enabling petascale fu...
Abstract. In this work, we discuss the porting to the GPU platform of the latest production version ...
The next decade of high-performance computing (HPC) systems will see a rapid evolution and divergenc...
We present multicore parallelization strategies for the particle-to-grid interpolation step in the G...
The Gyrokinetic Toroidal code (GTC) (version 2) is a 3D particle-in-cell application developed at th...
The Gyrokinetic Toroidal Code (GTC) is a global, three-dimensional particle-in-cell application deve...
The gyrokinetic toroidal code at Princeton (GTC-P) is a highly scalable and portable particle-in-cel...
Abstract—Conventional programming practices on multicore processors in high performance computing ar...
Gyrokinetic codes in plasma physics need outstanding computational resources to solve increasingly c...
We present novel optimizations of the fusion plasmas simulation code, GTC on Tianhe-2 supercomputer....