This paper presents a partitioning and allocation algorithm for an iterative stream compiler, targeting heterogeneous multiprocessors with constrained distributed memory and any communications topology. We introduce a novel definition of connectedness that enables the algorithm to model the capabilities of the compiler. The algorithm uses convexity and connectedness constraints to produce partitions that are easier to compile and require short pipelines. Software pipelining is an effective transformation, but it increases memory footprint and latency, and has a startup overhead. Our algorithm takes account of these downstream costs. We show results for the StreamIt 2.1.1 benchmarks for an SMP, 2 × 2 mesh, SMP plus accelerator, and IBM QS20 ...
Energy-efficient embedded computing enables new application scenarios in mobile devices like softwar...
With the increasing miniaturization of transistors, wire delays are becoming a dominant fac...
Over the past two decades, microprocessor manufacturers have typically relied on wider issue widths ...
Stream-based languages are a popular approach to expressing parallelism in modern applications. The ...
Stream programming offers a portable way for regular applications such as digital video, software ra...
Stream programming is a promising way to expose concurrency to the compiler. A stream program is bui...
Heterogeneous processing systems have become the industry standard in almost every segment of the co...
This thesis considers how to exploit the specific characteristics of data streaming functions and mu...
Embedded streaming applications are facing increasingly demanding performance requirements in terms ...
The StreamIt programming model has been proposed to exploit parallelism in streaming applications ...
Given the ubiquity of multicore processors, there is an acute need to enable the development of scal...
With the increasing miniaturization of transistors, wire delays are becoming a dominant factor in mi...
This paper describes a compiler for stream programs that efficiently schedules computational kernels...
We address programming of accelerator-based heterogeneous multiprocessors in the context of computat...
Kelly W, Flasskamp M, Sievers G, et al. A Communication Model and Partitioning Algorithm for Streami...