Conventional wisdom in designing concurrent data structures is to use the most powerful synchronization primitive, namely compare-and-swap (CAS), and to avoid contended hot spots. In building concurrent FIFO queues, this reasoning has led researchers to propose combining-based concurrent queues. This paper takes a different approach, showing how to rely on fetch-and-add (F&A), a less powerful primitive that is available on x86 processors, to construct a nonblocking (lock-free) linearizable concurrent FIFO queue which, despite the F&A being a contended hot spot, outperforms combining-based implementations by 1.5× to 2.5× at all concurrency levels on an x86 server with four multicore processors, in both single-processor and multi...
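To make the role of F&A concrete, here is a minimal C++ sketch of the primitive the abstract relies on: one `fetch_add` gives each caller a distinct slot index, and unlike a CAS loop it never has to retry. This is not the queue from the paper. The `TicketArray` type, its `put` method, and the `-1` empty marker are hypothetical names chosen for illustration, and the sketch omits dequeue, memory reclamation, and the unbounded case.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only: a fixed-size array in which each writer
// claims its slot with a single fetch-and-add on a shared counter.
struct TicketArray {
    // Every slot starts at -1, which this sketch treats as "empty".
    explicit TicketArray(std::size_t capacity) : slots(capacity, -1) {}

    // Claims the next index with fetch_add and stores value there.
    // Returns false if the claimed index is past the end of the array.
    // fetch_add returns the previous counter value, so two callers can
    // never receive the same index, and no retry loop is needed.
    bool put(int value) {
        std::size_t idx = tail.fetch_add(1, std::memory_order_relaxed);
        if (idx >= slots.size()) {
            return false;  // array exhausted; the counter keeps growing
        }
        slots[idx] = value;  // this index belongs to this caller alone
        return true;
    }

    std::atomic<std::size_t> tail{0};  // the shared (contended) counter
    std::vector<int> slots;
};
```

With a capacity of two, the first two calls to `put` succeed and fill slots 0 and 1, and a third call returns false. The shared counter is exactly the kind of contended hot spot the abstract mentions; the sketch only shows why each F&A still makes progress in a single step.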
This note describes a proposed extension to the architecture of shared memory multiprocessors with c...
Most multiprocessors are multiprogrammed to achieve acceptable response time. Unfortunately, inoppo...
Synchronization of concurrent threads is the central problem in order to design efficient concurrent...
A non-blocking FIFO queue algorithm for multiprocessor shared memory systems is presented in this pa...
As core counts increase and as heterogeneity becomes more common in parallel computing, we face the ...
In order to guarantee that each method of a data structure updates the logical state exactly once, a...
Link to published version: http://portal.acm.org/ft_gateway.cfm?id=248106&type=pdf&coll=portal&dl=AC...
Designing and implementing high-performance concurrent data structures whose access performance scal...
We present a new lock-free multiple-producer and multiple-consumer (MPMC) FIFO queue design which is...
Concurrent access to shared data in preemptive multi-tasks environment and in multi-processors archi...
We introduce fast and scalable algorithms that implement bounded- and unbounded-size lock-f...
The fetch-and-add (F&A) operation has been used effectively in a number of process coordination ...
With the growing use of multiprocessors, data structures that support concurrent operations have be...
Most multiprocessors are multiprogrammed to achieve acceptable response time. Unfortunately, inoppor...
Core-to-core communication is critical to the effective use of multi-core processors. A number of so...