Abstract. Modern processors use speculative execution to improve performance. However, speculative execution requires a checkpoint/restore mechanism to repair the machine’s state whenever speculation fails. Existing checkpoint/restore mechanisms do not scale well for processors with relatively large windows (i.e., 128 entries or more). This work presents Turbo-ROB (TROB), a checkpoint/restore recovery accelerator that can complement or replace existing checkpoint/restore mechanisms. We show that the Turbo-ROB improves performance and reduces resource requirements compared to a conventional Reorder Buffer (ROB) mechanism. For example, on average, a 64-entry TROB matches the performance of a 512-entry ROB, while a 128- and a 512-entry TROB outperform the 512...
Checkpointed Early Resource Recycling (Cherry) is a recently-proposed micro-architectural technique ...
High-frequency memory checkpointing is an important technique in several application domains, such a...
This paper revisits replication coupled with checkpointing for fail-stop errors. Replication enables ...
This Technical Report was sent to the Advisory Committee of MICRO-40 (June 8th, 2007) for review and pub...
Large instruction window processors achieve high performance by exposing large amounts of instructio...
Current superscalar processors use a Reorder Buffer (ROB) to support speculation, precise exceptions...
Superscalar processors take advantage of speculative execution to improve performance. When the spec...
Several processor architectures with large instruction windows have been proposed. They improve perf...
This is a post-peer-review, pre-copyedit version of an article published in New Generation Computing...
This paper presents ReVive, a novel general-purpose rollback recovery mechanism for shared-memory mu...
The increasing number of cores on current supercomputers will quickly decrease the mean time to fail...
Processor architectures with large instruction windows have been proposed to expose more instruction...
Checkpointing schemes enable fault-tolerant parallel and distributed computing by leveraging the red...
Modern out-of-order processors tolerate long latency memory operations by supporting a large number ...
To make progress in the face of failures, long-running parallel applications need to save their ...