This thesis investigates the performance of reducer hyperobjects, a feature of the Cilk task-parallel runtime system that enables concurrent associative updates to nonlocal variables. Reducers outperform more traditional methods of enabling concurrent updates, such as locking and atomic updates. Unfortunately, depending on the benchmark, an update through an existing reducer implementation can cost up to 10 times as much as a serial update. Three approaches can decrease this overhead: runtime data structures, compiler-runtime integration, and compiler optimization. When these approaches are used to performance-engineer the OpenCilk runtime system's reducers, the overall performance of a benchmark suite...
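The idea of concurrent associative updates can be illustrated with a minimal sketch. It is not the OpenCilk implementation and does not use Cilk syntax; it only mimics the general pattern of giving each parallel task a private view that starts at an identity element, then combining the views in order. The function name `reducer_sum` and its parameters are hypothetical and chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator


def reducer_sum(values, num_workers=4, identity=0, combine=operator.add):
    """Reduce `values` with one private view per chunk, then merge the views in order.

    Illustrative only: `combine` is assumed to be associative and `identity`
    is assumed to be its identity element.
    """
    # Split the input into contiguous chunks, roughly one per worker.
    chunk = max(1, -(-len(values) // num_workers))  # ceiling division
    chunks = [values[i:i + chunk] for i in range(0, len(values), chunk)]

    def update_view(part):
        view = identity  # each task starts from the identity element
        for v in part:
            view = combine(view, v)  # no lock needed: the view is private to this task
        return view

    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Executor.map returns results in the order of the input chunks.
        views = list(pool.map(update_view, chunks))

    # Merge views left to right, so associativity alone is enough
    # (commutativity is not required).
    return reduce(combine, views, identity)
```

Because the views are merged in their original order, the sketch also works for an associative but non-commutative operation such as string concatenation, for example `reducer_sum(list("abcdefg"), num_workers=3, identity="")`.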
In this paper we present a parallel implementation of Lévy's optimal reduction for the λ-calculus [1...
Funder: FP7 People: Marie‐Curie Actions; Id: http://dx.doi.org/10.13039/100011264; Grant(s): 327744S...
Efficiently using multicore architectures demands an increasing degree of fluency in parallel progra...
Reducer hyperobjects (reducers) provide a linguistic abstraction for dynamic multithreading that all...
This paper introduces hyperobjects, a linguistic mechanism that allows different branches of a mult...
The availability of multicore processors across a wide range of computing platforms has created a st...
This thesis describes Cilk, a parallel multithreaded language for programming contemporary shared me...
A parallel program consists of sets of concurrent and sequential tasks. Often, a reduction (such as ...
The power, frequency, and memory wall problems have caused a major shift in mainstream computing by ...
In this paper we analyze the effect of compiler optimizations on fine grain parallelism in scalar pr...
A multithreaded Cilk program that is ostensibly deterministic may nevertheless behave nondeterminist...
Reduction recognition and optimization are crucial techniques in parallelizing compilers. They are u...
High level programming language features have long been seen as improving programmer efficiency at s...
As the demand increases for high performance and power efficiency in modern computer runtime systems...
Two Constraint Handling Rules compiler optimizations that drastically reduce the memory footprint of...