NestStep is a collection of parallel extensions to existing programming languages. These extensions support a shared memory model and nested parallelism. NestStep is based on the Bulk Synchronous Parallel (BSP) model. Most data communication in NestStep takes place in a combine/commit phase, which is essentially a reduction followed by a broadcast. The primary aim of the project on which this thesis is based was to develop a runtime system for NestStep-C, the extensions for the C programming language. The secondary aim was to determine which of a few selected tree structures is best for communicating data in the combine/commit phase. This thesis includes information about NestStep, how to interface with the NestStep runtime system, som...
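The "reduction followed by a broadcast" pattern of the combine/commit phase can be illustrated with a small sequential simulation. This is only a sketch under stated assumptions, not the NestStep runtime's API: the function names (`combine_sum`, `commit_broadcast`, `combine_commit_sum`) are hypothetical, the processes are simulated as slots of one array, the reduction operator is assumed to be integer addition, and the tree is assumed to be a binary tree in heap layout (the parent of rank `i` is `(i - 1) / 2`), which is just one possible tree shape.

```c
#include <assert.h>
#include <stddef.h>

/* Combine (hypothetical name): reduce all contributions up a binary tree
 * rooted at rank 0. vals[i] holds the local value of simulated process i.
 * Ranks are visited from highest to lowest, so every child has already
 * folded in its own subtree before it is added to its parent. */
void combine_sum(int *vals, size_t nprocs)
{
    assert(vals != NULL || nprocs == 0);
    for (size_t i = nprocs; i-- > 1; ) {
        size_t parent = (i - 1) / 2;
        vals[parent] += vals[i];
    }
}

/* Commit (hypothetical name): broadcast the root's combined value down the
 * same tree. Ranks are visited from lowest to highest, so each parent
 * already holds the final value when its children copy it. */
void commit_broadcast(int *vals, size_t nprocs)
{
    assert(vals != NULL || nprocs == 0);
    for (size_t i = 1; i < nprocs; i++) {
        size_t parent = (i - 1) / 2;
        vals[i] = vals[parent];
    }
}

/* One combine/commit step: afterwards every simulated process sees the
 * same combined value. */
void combine_commit_sum(int *vals, size_t nprocs)
{
    combine_sum(vals, nprocs);
    commit_broadcast(vals, nprocs);
}
```

Calling `combine_commit_sum` on an array holding the values 1 to 5 leaves 15 in every slot. In a real distributed setting each array access across ranks would be a message, so the depth and fan-out of the tree determine how many communication rounds the phase needs; this is the kind of trade-off the comparison of tree structures addresses.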
We propose a cooperation between the programmer, the compiler and the runtime system to ide...
Minimizing interprocessor communication is the key to a parallelized program on executio...
To amortize the cost of MPI collective operations, non-blocking collectives ha...
The goal of this project is to create a source-to-source compiler which will translate NestStep code...
A runtime system provides a parallel language compiler with an interface to the low-level facilities...
Efficiently using multicore architectures demands an increasing degree of fluency in parallel progra...
A variety of historically-proven computer languages have recently been extended to support parallel ...
This paper describes the design and implementation of a scalable run-time system and an optimizing c...
In the field of scientific computing, the use of parallelism has led to widespread improvements in t...
We introduce a shared memory software prototype system for executing programs with nested parallelis...
This article describes a technique for path unfolding for conditional branches in parallel programs ...
PC clusters have become popular in parallel processing. They do not involve specialized in...
The Cell Broadband Engine processor is a powerful processor capable of over 220 GFLOPS. It is highly...
Data parallel languages are gaining interest as it becomes clear that they support a wider range of ...