Maximizing the productivity of modern multicore and manycore chips requires optimizing parallelism at the compute-node level. This is, however, a complex, multi-step, iterative process that requires determining the optimal degree of parallel scalability and optimizing memory access behavior. Further, multiple cases must be considered: programs that use only MPI, programs that use only OpenMP, and hybrid (MPI+OpenMP) programs. This paper presents a set of three coordinated workflows for determining the optimal parallelism at the program level for MPI programs and at the loop level for hybrid (MPI+OpenMP) cases. The paper also details largely automated implementations of these workflows using the PerfExpert infrastructure. Finally, the paper presents...
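The first workflow step named in the abstract, determining the optimal degree of parallel scalability, can be illustrated with a short sketch: given measured wall-clock times for a loop nest at several thread counts, pick the largest count whose parallel efficiency (speedup divided by thread count) still meets a threshold. The timing values and the 0.8 efficiency cutoff below are illustrative assumptions, not taken from the paper or from PerfExpert.

```python
def optimal_thread_count(timings, min_efficiency=0.8):
    """Return the largest thread count whose parallel efficiency
    (speedup / threads) still meets the given threshold.

    timings: dict mapping thread count -> wall-clock seconds,
    which must include the serial baseline at key 1.
    """
    t1 = timings[1]  # serial baseline time
    best = 1
    for n in sorted(timings):
        speedup = t1 / timings[n]
        if speedup / n >= min_efficiency:
            best = n
    return best

# Illustrative (made-up) timings for one loop nest:
timings = {1: 100.0, 2: 52.0, 4: 28.0, 8: 17.0, 16: 13.0}
print(optimal_thread_count(timings))  # -> 4
```

With these numbers, 4 threads achieve a speedup of about 3.57 (efficiency 0.89), while 8 threads reach only about 5.88 (efficiency 0.74), so the sweep stops at 4. A real tool would also weigh memory-access behavior, as the abstract notes, before settling on a thread count.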
Many/multi-core supercomputers provide a natural programming paradigm for hybrid MPI/OpenMP scientif...
Significantly increasing intra-node parallelism is widely recognised as being a key prerequisite for...
With the introduction of more powerful and massively parallel embedded processors, embedded systems ...
MPI is the predominant model for parallel programming in technical high performance computing. With ...
Abstract. The Hybrid method of parallelization (using MPI for inter-node communication and OpenMP fo...
After a brief introduction to Cross Motif Search and its OpenMP and hybrid OpenMP-MPI implementatio...
Overview Most HPC systems are clusters of shared memory nodes. To use such systems efficiently both...
The mixing of shared memory and message passing programming models within a single application has o...
Most HPC systems are clusters of shared memory nodes. To use such systems efficiently both memory co...
This is a post-peer-review, pre-copyedit version of an article published in Lecture Notes in Compute...
The most widely used node type in high-performance computing nowadays is a 2-socket server node. The...
Most HPC systems are clusters of shared memory nodes. Parallel programming must combine the distribu...
Holistic tuning and optimization of hybrid MPI and OpenMP applications is becoming a focus for paralle...
Most HPC systems are clusters of shared memory nodes. Parallel programming must combine the distribu...