Through analysis and experiments, this paper investigates two-phase waiting algorithms to minimize the cost of waiting for synchronization in large-scale multiprocessors. In a two-phase algorithm, a thread first waits by polling a synchronization variable. If the cost of polling reaches a limit L_poll and further waiting is necessary, the thread is blocked, incurring an additional fixed cost, B. The choice of L_poll is a critical determinant of the performance of two-phase algorithms. We focus on methods for statically determining L_poll because the run-time overhead of dynamically determining L_poll can be comparable to the cost of blocking in large-scale multiprocessor systems with lightweight threads. Our experiments show that always-blo...
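The two-phase rule described above can be illustrated with a small cost model. This is a minimal sketch, not the paper's own formulation: it assumes that polling cost grows one unit per unit of waiting time, and the names `two_phase_cost`, `wait_time`, `l_poll`, and `block_cost` are hypothetical, introduced only for illustration.

```python
import math


def two_phase_cost(wait_time: float, l_poll: float, block_cost: float) -> float:
    """Cost of one wait under a two-phase policy (illustrative model only).

    Assumption: polling costs one unit per unit of time waited.
    l_poll plays the role of the polling limit and block_cost the role
    of the fixed blocking cost B from the abstract.
    """
    if wait_time <= l_poll:
        # The awaited event arrives while the thread is still polling.
        return wait_time
    # The polling limit is reached: pay for the polling, then block.
    return l_poll + block_cost


if __name__ == "__main__":
    B = 10.0
    for t in (3.0, 8.0, 50.0):
        always_block = two_phase_cost(t, 0.0, B)      # limit 0: block at once
        always_poll = two_phase_cost(t, math.inf, B)  # no limit: never block
        two_phase = two_phase_cost(t, 5.0, B)         # arbitrary limit of 5
        print(f"wait={t}: block={always_block} poll={always_poll} two-phase={two_phase}")
```

In this model, always-blocking and always-polling are the two extremes of the same policy (a limit of zero and an unbounded limit), which is why the choice of the limit decides which waits are cheap and which are expensive.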
Block multithreaded architectures tolerate large memory and synchronization latencies by switching c...
Abstract. Synchronization in parallel programs is a major performance bottleneck. Shared data is pro...
Providing high-performance synchronization mechanisms is a key issue to benefi...
Efficient synchronization is important for achieving good performance in parallel programs, especial...
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer...
A distributed system is a group of processors that do not share memory. Instead, each p...
Barriers are widely used for synchronization in parallel programs. Since the process arrived earlier t...
Efficient synchronization is important for achieving good performance in parallel programs, especial...
Driven by growing application requirements and accelerated by current trends in microprocessor desig...
Multi-core processors are ubiquitous. Even embedded systems nowadays use processors with multiple co...
Large-scale shared-memory multiprocessors typically have long latencies for remote data accesses. A...
Due to the available concurrency in modern-day supercomputers, the complexity of developing efficien...
Busy-wait techniques are heavily used for mutual exclusion and barrier synchronization in shared-mem...
A fundamental issue that any control-based synchronization should address is how to mini...
Scalable busy-wait synchronization algorithms are essential for achieving good parallel program perf...