The performance effect of permitting different memory operations to be re-ordered is examined. The available parallelism is computed using a machine-code simulator. A range of possible restrictions on the re-ordering of memory operations is considered, from the purely sequential case, where no re-ordering is permitted, to the completely permissive one, where memory operations may occur in any order and parallelism is restricted only by data dependencies. A general conclusion is drawn that reliably obtaining parallelism beyond 10 instructions per clock will require the ability to re-order all memory instructions. A brief description of a feasible architecture capable of this is given.
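The methodology in this abstract can be illustrated with a toy trace simulator. The sketch below is not the paper's simulator: the instruction encoding, the trace, and the unit-latency assumption are invented for illustration. It computes the critical-path schedule length for a small trace under two policies, (a) all memory operations kept in program order, and (b) only true data dependencies enforced, and reports instructions per cycle for each.

```python
# Minimal sketch (hypothetical, not the paper's simulator): each instruction is
# (dests, srcs); names beginning with "M" model memory locations, others registers.

def schedule_length(trace, serialize_memory):
    ready = {}      # name -> earliest cycle its value is available
    last_mem = 0    # cycle after which the next memory op may issue (policy a)
    length = 0
    for dests, srcs in trace:
        start = max([ready.get(s, 0) for s in srcs], default=0)
        is_mem = any(n.startswith("M") for n in dests + srcs)
        if serialize_memory and is_mem:
            start = max(start, last_mem)   # memory ops stay in program order
        finish = start + 1                 # unit latency for every instruction
        if serialize_memory and is_mem:
            last_mem = finish
        for d in dests:
            ready[d] = finish
        length = max(length, finish)
    return length

trace = [
    (["r1"], ["M0"]),   # load
    (["r2"], ["r1"]),   # depends on the load above
    (["r3"], ["M1"]),   # independent load
    (["M2"], ["r3"]),   # store
    (["r4"], ["M3"]),   # independent load
]

seq = schedule_length(trace, serialize_memory=True)    # no memory re-ordering
rel = schedule_length(trace, serialize_memory=False)   # data dependencies only
print(len(trace) / seq, len(trace) / rel)              # relaxed policy exposes more ILP
```

Even on this five-instruction trace, lifting the memory-ordering restriction halves the schedule length, which is the effect the abstract measures at scale.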
Several parallel processing systems exist that can be partitioned and/or can operate in mul...
The use of large instruction windows coupled with aggressive out-of-order and prefetching capabiliti...
Current microprocessors incorporate techniques to exploit instruction-level parallelism...
High performance computer architectures increasingly use compile-time instruction scheduling to reor...
The problem of extracting Instruction Level Parallelism at levels of 10 instructions per clock and h...
Shared memory has been widely adopted as the primary system level programming abstraction on modern ...
The increasing density of VLSI circuits has motivated research into ways to utilize large area budge...
Modern multicore processor architectures and compilers of shared-memory concur...
There have been many recent studies of the "limits on instruction parallelism" in applicat...
To exploit instruction level parallelism, it is important not only to execute multiple memory refere...
In this paper, we study the impact of synchronization and granularity on the performance of parallel...
This paper discusses memory consistency models and their influence on software in the context of par...
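The influence of a consistency model can be made concrete with the classic "store buffering" litmus test, a standard example not taken from this paper. The sketch below enumerates all sequentially consistent executions of two two-operation threads; the outcome r1 == r2 == 0 never appears, whereas a model that lets each store be delayed past the other thread's load permits it.

```python
# Hypothetical illustration: store-buffering litmus test under sequential
# consistency. Thread 0 stores x then loads y; thread 1 stores y then loads x.
from itertools import permutations

T0 = [("store", "x", 1), ("load", "y", "r1")]
T1 = [("store", "y", 1), ("load", "x", "r2")]

def run(order):
    mem = {"x": 0, "y": 0}
    regs = {}
    for op, loc, val in order:
        if op == "store":
            mem[loc] = val
        else:
            regs[val] = mem[loc]
    return regs["r1"], regs["r2"]

# Sequentially consistent executions: every interleaving of the four operations
# that preserves each thread's program order.
sc_outcomes = set()
for perm in permutations(T0 + T1):
    ops = list(perm)
    if ops.index(T0[0]) < ops.index(T0[1]) and ops.index(T1[0]) < ops.index(T1[1]):
        sc_outcomes.add(run(ops))

print(sorted(sc_outcomes))       # (0, 0) is absent under sequential consistency
```

A relaxed model that reorders each store after the other thread's load would add (0, 0) to the set of observable outcomes, which is exactly the kind of behavior consistency models exist to specify.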