Large-scale shared-memory multiprocessors typically have long latencies for remote data accesses. A key issue for execution performance of many common applications is the synchronization cost. The communication scalability of synchronization has been improved by the introduction of queue-based spin-locks instead of Test & (Test & Set). For architectures with long access latencies for global data, attention should also be paid to the number of global accesses that are involved in synchronization. We present a method to characterize the performance of proposed queue lock algorithms, and apply it to previously published algorithms. We also present two new queue locks, the LH lock and the M lock. We compare the locks in terms of per...
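To illustrate what a queue-based spin lock looks like, here is a minimal C++ sketch in the style commonly called a CLH queue lock. It is not claimed to be the LH lock or the M lock from this paper. All names (QNode, QueueLock, Handle, run_counter) are hypothetical and chosen for illustration only. Each waiting thread spins on a flag in its predecessor's node rather than on one shared word, and acquiring the lock takes a single atomic exchange on the shared tail pointer.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical names throughout; an illustrative sketch, not the paper's algorithm.
struct QNode {
    std::atomic<bool> locked{false};  // true while the owner holds or waits for the lock
};

class QueueLock {
public:
    // Per-thread state: the node this thread owns and the predecessor it waits on.
    struct Handle {
        QNode* mine = new QNode;
        QNode* pred = nullptr;
        Handle() = default;
        Handle(const Handle&) = delete;
        Handle& operator=(const Handle&) = delete;
        ~Handle() { delete mine; }
    };

    QueueLock() : tail_(new QNode) {}      // dummy node, initially unlocked
    ~QueueLock() { delete tail_.load(); }  // assumes no thread holds or waits

    void lock(Handle& h) {
        h.mine->locked.store(true, std::memory_order_relaxed);
        // One atomic exchange on the shared tail enqueues this thread.
        h.pred = tail_.exchange(h.mine, std::memory_order_acq_rel);
        // Spin only on the predecessor's flag, not on a shared lock word.
        while (h.pred->locked.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
    }

    void unlock(Handle& h) {
        assert(h.pred != nullptr);  // unlock must follow a matching lock
        QNode* released = h.mine;
        h.mine = h.pred;            // recycle the predecessor's node for next time
        h.pred = nullptr;
        released->locked.store(false, std::memory_order_release);
    }

private:
    std::atomic<QNode*> tail_;
};

// Small driver: several threads increment a plain counter under the lock.
long run_counter(int threads, int iters) {
    QueueLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            QueueLock::Handle h;
            for (int i = 0; i < iters; ++i) {
                lock.lock(h);
                ++counter;
                lock.unlock(h);
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

If the lock provides mutual exclusion, calling run_counter with T threads and N iterations should return T * N. The sketch yields while spinning and makes no attempt to model the remote-access costs the abstract discusses.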
Efficient synchronization primitives are essential for achieving high performance in fine-grain, shared...
We present a fast and scalable lock algorithm for shared-memory multiprocessors addressing the resou...
A plethora of optimized mutex lock algorithms have been designed over the past...
Large-scale shared-memory multiprocessors typically have long latencies for remote data accesses. A...
Shared memory programs guarantee the correctness of concurrent accesses to shared dat...
Scalable busy-wait synchronization algorithms are essential for achieving good parallel program perf...
Synchronization primitives for large scale multiprocessors need to provide low latency and low conte...
Efficient synchronization is important for achieving good performance in parallel programs, especial...
Link to published version: http://ieeexplore.ieee.org/iel3/4440/12600/00580906.pdf?tp=&arnumber=5809...
Busy-wait techniques are heavily used for mutual exclusion and barrier synchronization in shared-mem...
Link to published version: http://portal.acm.org/ft_gateway.cfm?id=379566&type=pdf&coll=portal&dl=AC...
In multicores, performance-critical synchronization is increasingly performed in a lock-free manner ...
Synchronization in parallel programs is a major performance bottleneck. Shared data is pro...