The ever-increasing number of processing units integrated on a single many-core chip delivers computational power that can exceed the performance requirements of a single application. Fewer chips, and thus less power, are then needed to serve multiple applications, a practice called resource consolidation. However, this solution requires techniques to partition and assign resources among the applications and to manage unpredictable dynamic workloads. To meet the performance requirements in such scenarios, we exploit application auto-tuning, based on design-time analysis, of both application-specific dynamic knobs and computational parallelism. These features are implemented in a software library, which is ...