Abstract—Many scientific applications are organized in a data parallel way: as sequences of parallel and/or reduction loops. This exposes parallelism well, but does not convert data reuse between loops into data locality. This paper focuses on this issue in parallel loops whose loop-to-loop dependence structure is data-dependent due to indirect references such as A[B[i]]. Such references are a common occurrence in sparse matrix computations, molecular dynamics simulations, and unstructured-mesh computational fluid dynamics (CFD). Previously, sparse tiling approaches were developed for individual benchmarks to group iterations across such loops to improve data locality. These approaches were shown to benefit applications such as moldyn, Gau...
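To make the indirect-reference pattern in the abstract above concrete, here is a minimal sketch of two consecutive loops whose loop-to-loop dependences are only known once the index arrays are available. The names `left`, `right`, `run_loops`, and `producers_of` are hypothetical and chosen for illustration; this is not code from the paper.

```python
def run_loops(left, right, x):
    """Run an edge loop followed by a node loop (illustrative only)."""
    n = len(x)
    acc = [0.0] * n
    # Loop 1: reduction over edges. The written locations are acc[left[e]]
    # and acc[right[e]], so they depend on the contents of the index arrays.
    for e in range(len(left)):
        acc[left[e]] += x[right[e]]
        acc[right[e]] += x[left[e]]
    # Loop 2: parallel loop over nodes. Iteration i reads acc[i], which
    # loop 1 produced, so data is reused between the two loops.
    return [x[i] + 0.5 * acc[i] for i in range(n)]


def producers_of(node, left, right):
    """Loop-1 iterations that write acc[node], found by inspecting the index arrays."""
    return [e for e in range(len(left)) if left[e] == node or right[e] == node]


if __name__ == "__main__":
    left, right = [0, 1, 2], [1, 2, 3]   # a small chain of three edges
    x = [1.0, 2.0, 3.0, 4.0]
    print(run_loops(left, right, x))
    print(producers_of(1, left, right))
```

Because `producers_of` can only be evaluated at run time, any scheme that groups iterations of both loops for locality has to inspect `left` and `right` first, which is the situation the abstract describes.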
Many computationally intensive programs, such as those for differential equations, spatial interpola...
This paper presents a combined compile-time and runtime loop-carried dependence analysis of sparse m...
Sparse system solvers and general purpose codes for solving partial differential equations are examp...
Abstract—Increasingly, the main bottleneck limiting performance on emerging multi-core and many-core...
Abstract—Unstructured meshes are widely used in scientific computing for implementing numerical meth...
Finite Element problems are often solved using multigrid techniques. The most time-consuming part of...
The key common bottleneck in most stencil codes is data movement, and prior research has shown that ...
The Polyhedral model has proven to be a valuable tool for improving memory locality and exploiting p...
Tiling is a technique used for exploiting medium-grain parallelism in nested loops. It relies on a f...
Abstract—Parallelization and locality optimization of affine loop nests has been successfully addres...
Sparse tiling is a technique to fuse loops that access common da...
Existing techniques can enhance the locality of arrays indexed by affine functions of induction vari...
Typical parallelization approaches such as OpenMP and CUDA provide constructs for parallelizing and ...
Run-time compilation techniques have been shown effective for automating the parallelization of loop...
This paper presents a compiler and runtime framework for parallelizing sparse matrix computations th...