25th International Conference on Parallel and Distributed Computing, Göttingen, Germany, August 26-30, 2019, Proceedings

This paper demonstrates how OpenMP 4.5 tasks can be used to efficiently overlap computations and MPI communications, based on a case study conducted on multi-core and many-core architectures. It focuses on task granularity, dependencies, and priorities, and also identifies some limitations of OpenMP. Results on 64 Skylake nodes show that while 64% of the wall-clock time is spent in MPI communications, 60% of the cores are busy in computations, which is a good result. Indeed, the chosen dataset is small enough to be a challenging case in terms of overlap and thus useful to assess worst-case scenarios in fu...