Irregular memory access patterns in non-stencil kernel computing render the well-known hyperplane- [1], lattice- [2], or tessellation-based [3] HLS techniques ineffective. We develop an elegant yet effective technique that synthesizes a memory-optimal architecture from high-level software code in order to maximize application-specific data parallelism. Our basic idea is to exploit graph structures embedded in the data access pattern and computation structure in order to perform memory banking that maximizes parallel memory accesses while conserving both hardware and energy. Specifically, we priority color a weighted conflict graph generated by folding the fundamental conflict graph to maximize memory conflict reduction. Most intere...
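To make the idea of coloring a weighted conflict graph for memory banking concrete, here is a minimal sketch. It is not the authors' algorithm: the function name `priority_color`, the edge-weight meaning (how often two accesses fall in the same cycle), and the priority rule (heaviest total conflict weight first) are all assumptions chosen for illustration. The folding step that produces the weighted graph is not modeled.

```python
from collections import defaultdict


def priority_color(edges, num_banks):
    """Greedy bank assignment on a weighted conflict graph (illustrative only).

    edges: dict mapping (u, v) pairs to a conflict weight; the weight is
           assumed to count how often u and v are accessed in the same cycle.
    num_banks: number of memory banks (colors) available.
    Returns (bank_of, residual), where residual is the total weight of
    conflicts left between nodes that share a bank.
    """
    # Build a symmetric adjacency map from the edge list.
    adj = defaultdict(dict)
    for (u, v), w in edges.items():
        adj[u][v] = adj[u].get(v, 0) + w
        adj[v][u] = adj[v].get(u, 0) + w

    # Assumed priority rule: largest total conflict weight first,
    # with the node name as a deterministic tie-breaker.
    order = sorted(adj, key=lambda n: (-sum(adj[n].values()), str(n)))

    bank_of = {}
    for node in order:
        # Cost of a bank = weight of conflicts with neighbors already in it.
        cost = [0] * num_banks
        for nbr, w in adj[node].items():
            if nbr in bank_of:
                cost[bank_of[nbr]] += w
        # Pick the cheapest bank; the lowest index wins ties.
        bank_of[node] = min(range(num_banks), key=lambda b: (cost[b], b))

    residual = sum(w for (u, v), w in edges.items() if bank_of[u] == bank_of[v])
    return bank_of, residual


if __name__ == "__main__":
    # Hypothetical conflict graph over four array accesses.
    example = {("a", "b"): 5, ("b", "c"): 3, ("a", "c"): 1, ("c", "d"): 2}
    print(priority_color(example, 2))
    print(priority_color(example, 3))
```

With two banks the sketch leaves only the lightest conflict in this hypothetical graph unresolved, and with three banks it leaves none. This is the trade-off between bank count and remaining conflicts that a banking scheme of this kind has to balance.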
The availability and utility of large numbers of Graphical Processing Units (GPUs) have enabled para...
Finding minimal cuts on graphs with a grid-like structure has become a core task for solving many c...
There has been significant recent interest in parallel graph processing due to the need to quickly a...
High-Level Synthesis (HLS) has advanced significantly in compiling high-level “soft” programs into e...
The explosion of digital data and the ever-growing need for fast data analysis have made in-memory b...
The stagnant performance of single core processors, increasing size of data sets, and variety of str...
Semi-parallel, or folded, VLSI architectures are used whenever hardware resources need to be saved. ...
Stencil-based kernels constitute the core of many scientific applications on block-structured grids....
Mechanisms for improving the execution efficiency of graph algorithms on Data-Parallel Architectures...
Graph algorithms typically have very low computational intensities, hence their execution times are ...
Although modern supercomputers are composed of multicore machines, one can find scientists that stil...
Abstract. This paper proposes tiling techniques based on data dependencies and not on code structur...
In computer science, dependence analysis determines whether or not it is safe to parallelize stateme...
Future High Performance Computing (HPC) nodes will have many more processors than the contemporary a...
We study conflict-free data distribution schemes in parallel memories in multiprocessor system archi...