Efficient Warp Execution in Presence of Divergence with Collaborative Context Collection

Farzad Khorasani
Rajiv Gupta
Laxmi N. Bhuyan

Publication date

January 2016

Abstract

GPU’s SIMD architecture is a double-edged sword confronting parallel tasks with control flow divergence. On the one hand, it provides a high performance yet power-efficient platform to accelerate applications via massive parallelism; however, on the other hand, irregularities induce inefficiencies due to the warp’s lockstep traversal of all diverging execution paths. In this work, we present a software (compiler) technique named Collaborative Context Collection (CCC) that increases the warp execution efficiency when faced with thread divergence incurred either by different intra-warp task assignment or by intra-warp load imbalance. CCC collects the relevant registers of divergent threads in a warp-specific stack allocated in the fast ...
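
To make the cost of lockstep traversal concrete, the following is a minimal sketch, not the paper's method, of how warp execution efficiency drops when lanes of one warp take different sides of an if/else. It assumes a 32-lane warp, and the function name, the per-path instruction counts, and the simple cost model are all hypothetical choices made for illustration.

```python
WARP_SIZE = 32  # assumption: a warp has 32 lanes


def warp_execution_efficiency(takes_branch, cost_taken, cost_not_taken):
    """Fraction of issued lane-slots that do useful work for one if/else.

    takes_branch: list of WARP_SIZE booleans, one per lane.
    cost_taken, cost_not_taken: hypothetical instruction counts of the
    two paths. This is a toy model, not a measurement.
    """
    assert len(takes_branch) == WARP_SIZE
    n_taken = sum(takes_branch)
    n_not = WARP_SIZE - n_taken
    # In lockstep execution the warp issues a path if any lane needs it,
    # and the lanes on the other path sit idle for its whole duration.
    issued = (cost_taken if n_taken else 0) + (cost_not_taken if n_not else 0)
    if issued == 0:
        return 1.0
    useful = n_taken * cost_taken + n_not * cost_not_taken
    return useful / (issued * WARP_SIZE)


uniform = [True] * WARP_SIZE
half = [i % 2 == 0 for i in range(WARP_SIZE)]
one_busy_lane = [i == 0 for i in range(WARP_SIZE)]

print(warp_execution_efficiency(uniform, 10, 10))
print(warp_execution_efficiency(half, 10, 10))
print(warp_execution_efficiency(one_busy_lane, 100, 0))
```

Under this model, the first case corresponds to no divergence, the second to different intra-warp task assignment, and the third to intra-warp load imbalance, where a single lane holds the whole warp on a long path.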

