Abstract. State-of-the-art behavioral synthesis tools for reconfigurable architectures offer few high-level transformations for achieving highly parallelized implementations. If they offer any, it is loop unrolling to obtain higher throughput. In this paper, we use the PARO behavioral synthesis tool, which has the unique ability to perform both loop unrolling and loop partitioning. Loop unrolling replicates the loop kernel and exposes parallelism for hardware implementation, whereas partitioning tiles the loop program onto a regular array of tightly coupled processing elements. Using the same design tool for both variants enables, for the first time, a quantitative evaluation of the two approaches with the help of se...
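The difference between the two transformations can be illustrated in software. The following is a minimal sketch, not PARO's actual method or output: the kernel `a * x + 1`, the unroll factor of 4, and the names `kernel`, `unrolled_loop`, `partitioned_loop`, and `num_pes` are all hypothetical, chosen only for illustration. Unrolling replicates the loop body inside one loop, while partitioning splits the iteration space into tiles, each of which would be assigned to its own processing element.

```python
def kernel(a, x):
    """Hypothetical loop body; stands in for the real loop kernel."""
    return a * x + 1


def original_loop(a, xs):
    """Reference: one kernel evaluation per iteration."""
    out = [0] * len(xs)
    for i in range(len(xs)):
        out[i] = kernel(a, xs[i])
    return out


def unrolled_loop(a, xs):
    """Loop unrolling by an assumed factor of 4: the kernel is replicated,
    so the four copies could be mapped to parallel hardware units."""
    n = len(xs)
    out = [0] * n
    main = n - n % 4
    for i in range(0, main, 4):
        out[i] = kernel(a, xs[i])
        out[i + 1] = kernel(a, xs[i + 1])
        out[i + 2] = kernel(a, xs[i + 2])
        out[i + 3] = kernel(a, xs[i + 3])
    # Remainder iterations that do not fill a complete unrolled step.
    for i in range(main, n):
        out[i] = kernel(a, xs[i])
    return out


def partitioned_loop(a, xs, num_pes=4):
    """Loop partitioning (tiling): the iteration space is split into
    contiguous tiles, one per processing element (PE). The outer loop
    over PEs is sequential here but models concurrently running PEs."""
    n = len(xs)
    out = [0] * n
    tile = -(-n // num_pes)  # ceiling division: iterations per tile
    for pe in range(num_pes):
        lo = pe * tile
        hi = min(lo + tile, n)
        for i in range(lo, hi):  # each PE walks its own tile
            out[i] = kernel(a, xs[i])
    return out
```

All three functions compute the same result for any input length; they differ only in how the iterations are arranged, which is what determines the hardware structure in a synthesis setting.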
Runtime reconfiguration provides an efficient means to reduce the hardware cost, while satisfying th...
Developing efficient programs for many of the current parallel computers is not easy due to the arch...
Many sequential applications are difficult to parallelize because of unpredictable control flow, ind...
This paper considers the role of performance and area estimates from behavioral synthesis in design...
Due to the rapidly increasing complexity in hardware designs and competitive time to market trends i...
Application specific MPSoCs are often used to implement high-performance data-intensive applications...
Behavioral synthesis tools have made significant progress in compiling high-level programs into regi...
Reconfigurable circuits and systems have evolved from application specific accelerators to a gener...
Control divergence poses many problems in parallelizing loops. While predicated execution is commonl...
This thesis is about the design and implementation of a program transformation technique for paralle...
Loops are an important source of optimization. In this paper, we propose a new technique for optimiz...
This paper presents an approach to behavioral synthesis for loop-based BIST. By taking into account ...
Apart from academic ones, recently more and more commercial coarse-grained reconfigurable arrays have bee...
Loop pipelining is widely adopted as a key optimization method in high-level synthesis (HLS). Howeve...
Parallel processing has been used to increase performance of computing systems for the past several ...