Dataflow machines can "unravel" loops automatically so that many iterations of a loop can execute in parallel. Unbounded loop unraveling can strain the resources available on the machine and, in extreme cases, deadlock can occur due to overcommitment of resources. Previous efforts to address this problem have focused mainly on runtime mechanisms of debatable utility. Loop bounding, a compile-time technique, controls parallelism by introducing dependencies between loop iterations. The loop is given enough resources for the concurrent execution of some number of iterations, say $k$. The $(k+1)$st iteration uses the same resources as the first iteration and starts only after the first iteration is complete, and so on. Thus, the granul...
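The loop-bounding rule described in this abstract can be illustrated with a small timing model. This is only a sketch under stated assumptions, not the paper's compiler transformation: iterations are assumed to be otherwise independent, each has a known positive duration, and the names `bounded_schedule` and `peak_concurrency` are hypothetical helpers introduced here. Iteration `i` reuses the resources of iteration `i - k`, so it may start only once that earlier iteration has finished.

```python
def bounded_schedule(durations, k):
    """Return (starts, finishes) for a loop bounded to k concurrent iterations.

    Assumption: iterations are independent except for the artificial
    dependence that loop bounding adds from iteration i - k to iteration i.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    starts, finishes = [], []
    for i, d in enumerate(durations):
        # The first k iterations start immediately; later ones wait for
        # the iteration whose resources they reuse.
        start = finishes[i - k] if i >= k else 0
        starts.append(start)
        finishes.append(start + d)
    return starts, finishes


def peak_concurrency(starts, finishes):
    """Largest number of iterations in flight at any instant."""
    # Finish events (-1) sort before start events (+1) at the same time,
    # so a slot released at time t can be reused at time t.
    events = sorted([(s, 1) for s in starts] + [(f, -1) for f in finishes])
    running = peak = 0
    for _, delta in events:
        running += delta
        peak = max(peak, running)
    return peak


if __name__ == "__main__":
    starts, finishes = bounded_schedule([3, 1, 2, 1], k=2)
    print(starts, finishes, peak_concurrency(starts, finishes))
```

In this model the number of iterations in flight never exceeds `k`, which is the property the added inter-iteration dependencies are meant to enforce.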
In statically scheduled multiprocessors, inter-processor communication resources can be scheduled by ...
Traditionally, software pipelining is applied either to the innermost loop of a given loop nest or f...
Data-driven array architectures seem to be important alternatives for coarse-grained reconfigurable ...
Four scheduling strategies for dataflow graphs onto parallel processors are classified: (1) fully dy...
The term "dataflow" generally encompasses three distinct aspects of computation - a data-driven mode...
Scheduling dataflow graphs onto processors consists of assigning actors to processors, ordering their...
High-Level Synthesis (HLS) tools generate hardware designs from high-level programming languages. Th...
Dynamically scheduled high-level synthesis (HLS) achieves higher throughput than static HLS for code...
We consider the resource-constrained scheduling of loops with inter-iteration dependencies. A loop i...
Discussed are how loop level parallelism is detected in a nonprocedural dataflow program, and how a ...
This paper proposes an efficient run-time system to schedule general nested loops on multiprocessors...
Loop scheduling is an important problem in parallel processing. The retiming technique reorganizes a...
Uncountable loops (such as while loops in C) and if-conditions are some of the most common construct...
A central task in high-level synthesis is scheduling: the allocation of operations to clock cycles. ...
The efficient implementation of parallel loops on distributed-memory multicomputers is a hot topic ...