Tree structures are among the most pervasive data structures. Many tree-based applications exhibit abundant data-level parallelism (DLP), for which modern SIMD architectures, such as GPUs and the SIMD instruction extensions of CPUs, appear well suited. However, because control flow and data accesses in these applications are highly irregular, exploiting DLP on SIMD architectures is difficult, and turning SIMD resources into real performance is challenging. This thesis focuses on two such irregular applications, intensive tree search and repeated tree traversals, and proposes two corresponding approaches on CPUs and GPUs. The first proposed approach is Poker, a permutation-based SIMD approach for vectorizing intensive search...
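To make the irregularity concrete, the following is a minimal sketch of a batch of tree searches executed in lock-step, the way SIMD lanes would run them. It is not the Poker approach described in the thesis; the array-based tree layout, the function names `build_implicit_bst` and `lockstep_search`, and the utilization metric are all assumptions introduced for illustration. Lanes whose search finishes early sit idle while the others continue, which is one way data-dependent control flow erodes SIMD efficiency.

```python
def build_implicit_bst(sorted_keys):
    """Store sorted keys as a binary search tree in an array.

    Hypothetical layout: the children of node i live at 2*i + 1 and 2*i + 2.
    """
    tree = [None] * len(sorted_keys)
    keys = iter(sorted_keys)

    def fill(i):
        # An in-order walk over the array indices assigns keys in sorted order.
        if i < len(tree):
            fill(2 * i + 1)
            tree[i] = next(keys)
            fill(2 * i + 2)

    fill(0)
    return tree


def lockstep_search(tree, queries):
    """Search all queries level by level, one query per simulated SIMD lane.

    Returns (found flags, lane utilization). Utilization is the fraction of
    lane-steps that did useful work; it drops when lanes finish at
    different depths.
    """
    n = len(tree)
    node = [0] * len(queries)          # current node index per lane
    found = [False] * len(queries)
    active = [n > 0] * len(queries)    # mask of lanes still searching
    steps = 0
    busy = 0
    while any(active):
        steps += 1
        busy += sum(active)
        for lane, q in enumerate(queries):
            if not active[lane]:
                continue               # masked-off lane: wasted SIMD slot
            key = tree[node[lane]]
            if q == key:
                found[lane] = True
                active[lane] = False
            else:
                # Data-dependent branch: go left or right depending on the key.
                node[lane] = 2 * node[lane] + 1 + (q > key)
                if node[lane] >= n:
                    active[lane] = False   # fell off the tree: key absent
    utilization = busy / (steps * len(queries)) if steps else 1.0
    return found, utilization


tree = build_implicit_bst([1, 2, 3, 4, 5, 6, 7])
found, utilization = lockstep_search(tree, [4, 1, 7, 8])
```

In this small example the query for 4 hits at the root and its lane idles for the remaining steps, so utilization falls below 1.0 even though every lane performs the same kind of work.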
Increasing single instruction multiple data (SIMD) capabilities in modern hardware allows for the co...
Irregular applications have frequent data-dependent memory accesses and control flow. They arise in ...
In this paper, we address the design and implementation of GPU-accelerated Bra...
Many domains in computer science, from data-mining to graphics to computational astrophysics, focus ...
With the advent of programmer-friendly GPU computing environments, there has been much interest in o...
In this paper, we accelerate the processing of tree-based index structures by using SIMD instructio...
Graphics Processing Units (GPUs) have become a popular platform for executing general purpose (i...
Recent graphics processing units (GPUs) have emerged as a promising platform for general purpose...
The use of GPUs has enabled us to achieve substantial acceleration in highly regular data pa...
This paper presents a new technique for introducing and tuning parallelism for heterogeneous shared-...
The single core processor, which has dominated for over 30 years, is now obsolete with recent trends...
The set of tree-recursive algorithms is large, including constraint satisfaction using back-tracking...
Funding: This work was supported by the EU Horizon 2020 project, TeamPlay, Grant Number 779882, and ...
Increasing single instruction multiple data (SIMD) capabilities in modern hardware allows for compil...