Various memory consistency model implementations (e.g., x86, SPARC) willfully allow a core to see its own stores while they are in limbo, i.e., executed (and perhaps retired) but not yet inserted in memory order. This is known as store-to-load forwarding, and it is necessary to preserve the local thread's sequential program semantics while achieving high performance. However, it can lead to counter-intuitive behaviours, and fences are required to prevent them when needed. Other vendors (e.g., IBM 370 and the z/Architecture series) opt for enforcing what we call in this work store atomicity, that is, preventing a core from seeing its own stores before they are written to memory, trading off performance for a more intuitive memory model. I...
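The difference between the two policies described above can be illustrated with a toy model. The sketch below is not the mechanism of this work; it is a minimal simulation under stated assumptions: each core has a FIFO store buffer, the names `Core` and `run_litmus` are hypothetical, and only one fixed interleaving of a two-core litmus test is executed. With forwarding enabled, a load reads the youngest matching store from the core's own buffer. With forwarding disabled, a load that matches a buffered store first drains the buffer, in order, until that store has reached memory.

```python
from collections import deque


class Core:
    """Toy core: a FIFO store buffer in front of a shared memory dict."""

    def __init__(self, memory, forwarding):
        self.memory = memory            # shared by all cores
        self.forwarding = forwarding    # True: forward from the buffer
        self.store_buffer = deque()     # (addr, value), oldest first

    def store(self, addr, value):
        # The store is executed but not yet inserted in memory order.
        self.store_buffer.append((addr, value))

    def drain_one(self):
        # Write the oldest buffered store to memory.
        addr, value = self.store_buffer.popleft()
        self.memory[addr] = value

    def drain_all(self):
        while self.store_buffer:
            self.drain_one()

    def load(self, addr):
        pending = [v for a, v in self.store_buffer if a == addr]
        if pending and self.forwarding:
            # Store-to-load forwarding: see the youngest own store early.
            return pending[-1]
        # No forwarding (assumed store-atomic policy): drain in FIFO order
        # until no store to this address is left in the buffer.
        while any(a == addr for a, _ in self.store_buffer):
            self.drain_one()
        return self.memory[addr]


def run_litmus(forwarding):
    """Core 0: x=1; r1=x; r2=y.  Core 1: y=1; r3=y; r4=x."""
    memory = {"x": 0, "y": 0}
    c0 = Core(memory, forwarding)
    c1 = Core(memory, forwarding)
    c0.store("x", 1)
    c1.store("y", 1)
    r1 = c0.load("x")
    r2 = c0.load("y")
    r3 = c1.load("y")
    r4 = c1.load("x")
    c0.drain_all()
    c1.drain_all()
    return (r1, r2, r3, r4)


if __name__ == "__main__":
    print("forwarding:   ", run_litmus(True))
    print("no forwarding:", run_litmus(False))
```

In this interleaving, the forwarding run lets each core read its own store while both loads of the other core's location still return 0, which is the kind of counter-intuitive outcome the text refers to. In the run without forwarding, the first load on each core forces that core's store into memory, so the two cross-core loads cannot both return 0.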
For performance reasons, modern multiprocessors implement relaxed memory consistency models that adm...
Transactional memory systems promise to simplify parallel programming by avoiding deadlock, livelock...
Modern out-of-order processor architectures focus significantly on the high performance execution of...
Abstract. We extend the notion of Store Atomicity [Arvind and Jan-Willem Maessen. Memory model = instr...
Robustness is a correctness notion for concurrent programs running under relaxed consistency models....
Abstract. When verifying a concurrent program, it is usual to assume that memory is sequentially con...
We present a novel framework for defining memory models in terms of two properties: thread-local Ins...
In an out-of-order core, the load queue (LQ), the store queue (SQ), and the store buffer (SB) are re...
Store misses cause significant delays in shared-memory multiprocessors because of limited store buff...
We present a non-speculative solution for a coalescing store buffer in total store order (TSO) consi...
Speculative parallelization (SP) enables a processor to extract multiple threads from a single seque...
Writing shared-memory parallel programs is an error-prone process. Atomicity violations are especial...
Abstract. We study two operational semantics for relaxed memory models. Our first formalization is b...
This work presents BMW, a new design for speculative implementations of memory consistency models in...