Efficient synchronization can dramatically improve the performance of shared-memory parallel programs. Past work has proposed pairwise synchronization primitives, such as Queue-on-Lock-Bit (QOLB) [6, 5] and the transactional memory model [7, 26], that offer higher performance than the primitives in wide use today, such as MCS locks [16]. As originally conceived, QOLB and the transactional memory model require modifications to commodity processors and thus have not yet been implemented. In this report, we present SOFTQOLB, an implementation of QOLB that runs entirely in software and can therefore run on unmodified commodity workstation clusters. We describe our implementation in detail and present an evaluation of its performance u...
Link to published version: http://ieeexplore.ieee.org/iel3/4440/12600/00580906.pdf?tp=&arnumber=5809...
Shared memory multiprocessor systems typically provide a set of hardware primitives in order to supp...
On shared memory multiprocessors, synchronization often turns out to be a performance bottleneck and...
Efficient synchronization primitives are essential for achieving high performance in fine-grain, shared...
Large-scale shared-memory multiprocessors typically have long latencies for remote data accesses. A...
This paper addresses the problem of universal synchronization primitives that can support scalable th...
The thesis investigates non-blocking synchronization in shared memory systems, in particular in high...
We introduce Transient Blocking Synchronization (TBS), a new approach to hardware synchronization fo...
We introduce a non-blocking full/empty bit primitive, or NB-FEB for short, as a promising synchroniz...
Busy-wait techniques are heavily used for mutual exclusion and barrier synchronization in shared-mem...
Each generation of shared memory Multi-Processor System-on-Chips (MPSoCs) tend...
This paper investigates optimized synchronization techniques for shared memory on-chip multiprocesso...
Programs written in concurrent object-oriented languages, especially ones that employ threadsafe reu...