The ever-growing complexity of high performance computing systems makes it increasingly challenging to fully exploit their computational and memory resources. Recently, the Cache-aware Roofline Model has gained popularity due to its simplicity in modeling multi-core processors with complex memory hierarchies, characterizing application bottlenecks, and quantifying achieved or remaining improvements. In this short paper we use hardware locality topology detection to build the Cache-aware Roofline Model for modern processors in an open-source locality-aware tool. The proposed tool also includes a set of specific micro-benchmarks to assess the micro-architecture's performance upper bounds. The experimental results sho...
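As a rough illustration of the roofline idea the abstract refers to, the sketch below computes the attainable performance bound for each memory level as the minimum of the peak compute rate and the product of that level's bandwidth and the arithmetic intensity. The function names, the memory level names, and every numeric value are hypothetical placeholders chosen for the example; they are not measurements from the paper or output of the tool it describes.

```python
def attainable_gflops(peak_gflops, bandwidths_gbs, intensity):
    """Roofline bound per memory level, in GFLOP/s.

    peak_gflops:    peak floating-point rate of the processor (GFLOP/s)
    bandwidths_gbs: dict mapping a memory level name to its bandwidth (GB/s)
    intensity:      arithmetic intensity of the kernel (flops per byte)
    """
    return {
        level: min(peak_gflops, bandwidth * intensity)
        for level, bandwidth in bandwidths_gbs.items()
    }


def ridge_points(peak_gflops, bandwidths_gbs):
    """Arithmetic intensity at which each level stops being the bottleneck."""
    return {
        level: peak_gflops / bandwidth
        for level, bandwidth in bandwidths_gbs.items()
    }


# Hypothetical machine parameters, for illustration only.
PEAK_GFLOPS = 100.0
BANDWIDTHS_GBS = {"L1": 400.0, "L2": 200.0, "L3": 100.0, "DRAM": 25.0}

# A kernel performing 0.5 flops per byte moved.
bounds = attainable_gflops(PEAK_GFLOPS, BANDWIDTHS_GBS, 0.5)
ridges = ridge_points(PEAK_GFLOPS, BANDWIDTHS_GBS)

for level in BANDWIDTHS_GBS:
    print(f"{level}: bound {bounds[level]} GFLOP/s, ridge at {ridges[level]} flops/byte")
```

With these assumed numbers, a kernel at 0.5 flops per byte is compute-bound when its data is served from L1 or L2 and bandwidth-bound when served from L3 or DRAM, which is the kind of bottleneck characterization the model is used for.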
Technical report. Processor speeds continue to increase at faster rates than memory speeds. As this pe...
This research is part of a co-design project that has the goal of designing hardware systems to matc...
Modern computing platforms are increasingly complex, with multiple cores, shar...
Proceedings of: Third International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2016...
In order to fulfill modern applications' needs, computing systems become more p...
With energy-efficient architectures, including accelerators and many-core processors, gaining tracti...
The roofline model is a popular approach to "bounds and bottleneck" performan...
The increasing complexity of parallel computing platforms requires a deep know...
The end of Dennard scaling signaled a shift in HPC supercomputer architectures from systems built fr...
Over the years, the complexity of High Performance Computing (HPC) systems’ memory hierarchy has incr...
Parallel computing platforms are increasingly complex, with multiple cores, shared caches, and NUMA ...
HPC applications usually run at a low fraction of the computer's peak performance. Empirical perform...
The increasing computation capability of servers comes with a dramatic increas...
High-performance computing requires a deep knowledge of the hardware platform ...