Modern High Performance Computing (HPC) systems are complex, with deep memory hierarchies and increasing use of computational heterogeneity via accelerators. When developing applications for these platforms, programmers face two bad choices. On the one hand, they can explicitly manage machine resources, writing programs using low-level primitives from multiple APIs (e.g., MPI+OpenMP), creating efficient but rigid, difficult-to-extend, and non-portable implementations. Alternatively, users can adopt higher-level programming environments, often at the cost of lost performance. Our approach is to maintain the high-level nature of the application without sacrificing performance by relying on the transfer of high-level, application semanti...
Widespread adoption of parallel computing depends on the availability of improved software environme...
Communication is an important but difficult aspect of parallel programming. This paper describes a p...
As the demand increases for high performance and power efficiency in modern computer runtime systems...
The Standard Template Adaptive Parallel Library (STAPL) is a parallel programming infrastructure tha...
High Performance Computing (HPC) has always been a key foundation for scientific simulation and disc...
Multicore chips have become the standard building blocks for all current and future massively parall...
Languages and tools currently available for the development of parallel applications are difficult t...
We present the design and implementation of the Standard Template Adaptive Parallel Library (stapl...
This work describes my solution to the performance portability problem: between CPUs and GPUs in par...
Hybrid parallel programming models that combine message passing (MP) and shared-memory multithreadi...
Since the invention of the transistor, clock frequency increase was the primary method of improving ...
The current trends in high performance computing show that large machines with tens of thousands of ...
Recent developments in supercomputing have brought us massively parallel machines. With the number o...
Parallel machines with an extremely large number of processors (at least tens of thousands processor...
In parallel programming, a concurrent container usually distributes its elements to all processing u...