Unstructured-mesh based numerical algorithms such as finite volume and finite element algorithms form an important class of applications for many scientific and engineering domains. The key difficulty in achieving higher performance from these applications is the indirect accesses that lead to data races when parallelized. Current methods for handling such data races lead to reduced parallelism and suboptimal performance. Particularly on modern many-core architectures such as GPUs, which have increasing core/thread counts, reducing data movement and exploiting memory locality are vital for good performance. In this work we present novel locality-exploiting optimizations for the efficient execution of unstructured-mesh algorithms on GP...
General-purpose computing on GPUs is widely adopted for scientific applications, providing inexpensi...
This work presents a parallel implementation of density-based topology optimization using distribute...
Domain decomposition based on spatial locality is a classical data-parallel problem whose solution m...
This paper addresses two key parallelization challenges the unstructured mesh-based ocean mo...
The present work investigates the feasibility of finite element methods and topology optimi...
In unstructured finite volume method, loop on different mesh components such as cells, face...
Programming models such as CUDA and OpenCL allow the programmer to specify the independence of threa...
Many numerical optimisation problems rely on fast algorithms for solving sparse triangular systems o...
Graphical processing units (GPUs) have recently attracted attention for scientific applications such...
The massive parallelism provided by general-purpose GPUs (GPGPUs) possessing numerous compute thread...
This paper presents GPU parallelization for a computational fluid dynamics solver which works on a m...
Achieving optimal performance on the latest multi-core and many-core architectures increasingly depe...
Applications that operate on meshes are very popular in High Performance Computing (HPC) environment...
Graphics Processing Units (GPUs) are a fast evolving architecture. Over the last decade their progra...