This paper proposes a new processor architecture for handling hard-to-predict branches, the diverge-merge processor (DMP). The goal of this paradigm is to eliminate branch mispredictions due to hard-to-predict dynamic branches by dynamically predicating them without requiring ISA support for predicate registers and predicated instructions. To achieve this without incurring large hardware cost and complexity, the compiler provides control-flow information by hints and the processor dynamically predicates instructions only on frequently executed program paths. The key insight behind DMP is that most control-flow graphs look and behave like simple hammock (if-else) structures when only frequently executed paths in the graphs are considered. Th...
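As a rough software analogy for the hammock idea described above, the sketch below contrasts an ordinary if-else with a predicated form that computes both paths and selects one result at the merge point. It is an illustration only, not the DMP hardware mechanism: the function names and the arithmetic on each path are hypothetical and chosen purely for the example.

```python
def hammock_branch(cond, x):
    """Ordinary if-else hammock: only one of the two paths executes."""
    if cond:
        y = x + 1      # taken path
    else:
        y = x * 2      # not-taken path
    return y + 10      # merge point: code that runs on either path


def hammock_predicated(cond, x):
    """Predicated form: both paths execute, then one result is selected."""
    y_taken = x + 1          # computed as if the branch were taken
    y_not_taken = x * 2      # computed as if the branch were not taken
    # Select the live value once the branch condition is known.
    y = y_taken if cond else y_not_taken
    return y + 10            # same merge-point code as above


# Both forms agree for every condition and input tried here.
for cond in (True, False):
    for x in range(-3, 4):
        assert hammock_branch(cond, x) == hammock_predicated(cond, x)
```

The predicated form never has to guess the branch direction, at the cost of doing the work of both paths. That trade-off is why the abstract restricts predication to hard-to-predict branches and frequently executed paths.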
Our goal is to dramatically increase the performance of uniprocessors through the exploitation of in...
Control divergence poses many problems in parallelizing loops. While predicated execution is commonl...
This article describes a technique for path unfolding for conditional branches in parallel programs ...
Dynamic predication has been proposed to reduce the branch misprediction penalty due to hard-to-pred...
Even after decades of research in branch prediction, branch predictors still remain imperfect, w...
Conditional branches are expensive. Branches require a significant percentage of execution cycles si...
Current processors exploit out-of-order execution and branch prediction to improve instruction level...
Mobile and PC/server class processor companies continue to roll out flagship core microarch...
Data-parallel architectures must provide efficient support for complex control-flow constru...
High performance microprocessors have relied on accurate branch predictors to maintain high instruct...
Due to limits in technology scaling, energy efficiency of logic devices is decreasing in successive...
Pipeline stalls due to branches represent one of the most significant impediments to realizing the p...
Irregular control-flow structures like deeply nested conditional branches are common in real-world s...
This paper presents the concept of dynamic control independence (DCI) and shows how it can be detect...