Exploiting high degrees of instruction-level parallelism is often hampered by frequent branching. Both exposed branch latency and low branch throughput can restrict parallelism. Control critical path reduction (control CPR) is a compilation technique that addresses these problems. Control CPR can reduce the dependence height of critical paths through branch operations and decrease the number of executed branches. In this paper, we present an approach to control CPR that recognizes sequences of branches using profiling statistics. The control CPR transformation is applied to the predominant path through this sequence. Our approach, its implementation, and experimental results are presented. This work demonstrates that co...
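The idea behind control CPR can be illustrated with a small sketch. It is not the paper's implementation: the function names (`branch_chain`, `with_bypass`), the example exit conditions, and the branch-counting model are all assumptions made for illustration. The sketch assumes the exit conditions are side-effect free and safe to evaluate speculatively, so they can all be computed up front and combined into a single bypass test. On the predominant (no-exit) path only that one branch executes; when some exit is taken, control falls back to the original branch chain, which plays the role of compensation code.

```python
from typing import Callable, List, Optional, Tuple

Cond = Callable[[int], bool]

# Hypothetical exit conditions for a sequence of rarely taken branches.
EXIT_CONDS: List[Cond] = [
    lambda x: x < 0,
    lambda x: x > 100,
    lambda x: x % 7 == 0,
]


def branch_chain(x: int, conds: List[Cond]) -> Tuple[Optional[int], int]:
    """Original code: test each exit branch in order.

    Returns (index of the exit taken or None, number of branches executed).
    Each branch depends on the previous one falling through, so the
    dependence height grows linearly with the number of branches.
    """
    executed = 0
    for i, cond in enumerate(conds):
        executed += 1
        if cond(x):
            return i, executed
    return None, executed


def with_bypass(x: int, conds: List[Cond]) -> Tuple[Optional[int], int]:
    """Transformed code: one bypass branch guards the predominant path.

    All conditions are evaluated independently of each other (on a wide
    machine these could issue in parallel) and OR-ed into a single test.
    """
    flags = [cond(x) for cond in conds]  # speculative, order-independent
    executed = 1                         # the single bypass branch
    if not any(flags):
        return None, executed            # predominant path: one branch only
    # Off-trace: re-run the original chain to find which exit was taken.
    idx, extra = branch_chain(x, conds)
    return idx, executed + extra


if __name__ == "__main__":
    for x in (5, 14, -7):
        print(x, branch_chain(x, EXIT_CONDS), with_bypass(x, EXIT_CONDS))
```

In this model the transformation pays off only when the no-exit path dominates, since an off-trace execution costs one branch more than the original chain; this is why the choice of path is driven by profile data.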
In this paper we describe a Configuration PRofiling tool (CPR) and show how it can be used to aid co...
To make progress in the face of failures, long-running parallel applications need to save their ...
This paper presents the concept of dynamic control independence (DCI) and shows how it can be detect...
control dependences, recurrences, parallelism, control height reduction, back-substitution, blocked ...
Conditional branches are expensive. Branches require a significant percentage of execution cycles si...
Profile-based optimizations can be used for instruction scheduling, loop scheduling, data p...
Mobile and PC/server class processor companies continue to roll out flagship core microarch...
Branch effects are the biggest obstacle to gaining significant speedups when running general-purpose...
Large instruction window processors achieve high performance by exposing large amounts of instructio...
Pipelined microprocessors allow the simultaneous execution of several machine instructions at a time...
There is an insatiable demand for computers of ever-increasing performance. Old applications are appl...
Irregular control-flow structures like deeply nested conditional branches are common in real-world s...
Procedures are the basic units of compilation in traditional optimization frameworks. This presents ...
A programming tool that performs analysis of critical paths for parallel programs has been developed...
Modern CPUs rely on expensive branch predictors to speed up execution. Predictions nevertheless impl...