Chase and Lev's concurrent deque is a key data structure in shared-memory parallel programming and plays an essential role in work-stealing schedulers. We provide the first correctness proof of an optimized implementation of Chase and Lev's deque on top of the POWER and ARM architectures: these provide very relaxed memory models, which we exploit to improve performance but which considerably complicate the reasoning. We also study an optimized x86 and a portable C11 implementation, conducting systematic experiments to evaluate the impact of memory barrier optimizations. Our results demonstrate the benefits of hand tuning the deque code when running on top of relaxed memory models.
Memory models define an interface between programs written in some language an...
Weakestmo is a recently proposed memory consistency model that uses event structures to resolve the ...
Bounded single-producer single-consumer FIFO queues are one of the simplest co...
This is the full version of the paper, which includes more detailed proofs than the conference versi...
Work-stealing is an efficient method to implement load balancing in fine-grained task parallelism. T...
This paper describes Coq libraries devoted to the semantics of relaxed memory models. These libraries...
We present a class of relaxed memory models, defined in Coq, parameterised by...
We prove the correctness of the concurrent deque component of a recent implementation of the work-st...
We present a work-stealing algorithm for total-store memory architectures, such as Intel's X86, that...
There is a joke where a physicist and a mathematician are asked to herd cats. ...
Load balancing is a technique which allows efficient parallelization of irregular workloads, and a k...
The fork-join paradigm of concurrent expression has gained popularity in conjunction with work-steal...
Concurrent programs running on weak memory models exhibit relaxed behaviours,...
Dynamic parallel scheduling using work-stealing has gained popularity in academia and industry for i...
This paper studies the data locality of the work-stealing scheduling algorithm on hardware-controlle...