International audienceResearch interest and industry investment in edge computing solutions have increased dramatically in recent years. Consequent quest for balanced performance, energy efficiency and flexibility bestowed surging popularity on Coarse Grained Reconfigurable Array (CGRA) architectures. To further improve the performance and energy efficiency, several hardware and software-based loop optimizations are adopted for CGRAs. In this paper, we propose a centralized hardware-based loop optimization technique to achieve better area and energy results compared to the previously implemented distributed version. Without incurring any performance degradation, area overhead against the reference architecture is reduced down to 1.5% for a ...