Abstract. This paper proposes and evaluates new synchronization schemes for a simultaneous multithreaded processor. We present a scalable mechanism that permits threads to synchronize cheaply within the processor, with blocked threads consuming no processor resources. We also introduce the concept of lock release prediction, which yields an additional improvement of 40%. Overall, we show that these reductions in synchronization cost enable the parallelization of code that could not be parallelized effectively using traditional techniques.
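As a rough software analogy for the idea that blocked threads should consume no processor resources, the sketch below contrasts with busy-waiting: a thread that cannot acquire the lock sleeps on a condition variable instead of spinning. This is not the hardware mechanism the paper proposes. The names `BlockingLock` and `run_counter` are hypothetical and introduced only for illustration.

```python
import threading


class BlockingLock:
    """Illustrative lock whose waiters sleep instead of spinning (assumed design)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._held = False

    def acquire(self):
        with self._cond:
            # A waiting thread is descheduled inside wait(); it does not poll.
            while self._held:
                self._cond.wait()
            self._held = True

    def release(self):
        with self._cond:
            self._held = False
            # Wake one sleeping waiter, if any, so it can retry the acquire.
            self._cond.notify()


def run_counter(num_threads=4, iterations=1000):
    """Increment a shared counter under the lock and return the final value."""
    lock = BlockingLock()
    total = 0

    def worker():
        nonlocal total
        for _ in range(iterations):
            lock.acquire()
            total += 1  # critical section
            lock.release()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total
```

Calling `run_counter(4, 1000)` should return the product of the thread count and the iteration count, since every increment happens inside the critical section. A spin lock would give the same result but keep waiting threads busy on the processor, which is the cost the abstract's mechanism is meant to avoid.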
This paper presents a novel mechanism for barrier synchronization on chip multi-processors (CMPs). B...
The only reason to parallelize a program is to gain performance. However, the synchronization primit...
The advent of chip multi-processors has led to an increase in computational performance in recent ye...
Existing multiprocessor synchronization mechanisms are relatively heavyweight, due in part to the le...
Traditional compilation techniques for synchronization have targeted architectures with relatively...
Scalable busy-wait synchronization algorithms are essential for achieving good parallel program perf...
Scalable busy-wait synchronization algorithms are essential for achieving good parallel program perf...
Efficient synchronization is important for achieving good performance in parallel programs, especial...
Efficient synchronization primitives are essential for achieving high performance in fine-grain, shared...
Efficient synchronization is important for achieving good performance in parallel programs, especial...
Busy-wait techniques are heavily used for mutual exclusion and barrier synchronization in shared-mem...
For most multi-threaded applications, data structures must be shared between threads. Ensuring threa...
Multicore design is a major issue in modern computer architectures. Programmers are urged to design ...
In multicores, performance-critical synchronization is increasingly performed in a lock-free manner ...
Programs written in concurrent object-oriented languages, especially ones that employ threadsafe reu...