This paper develops a plug-and-play reusable LogGP model that can be used to predict the runtime and scaling behavior of different MPI-based pipelined wavefront applications running on modern parallel platforms with multicore nodes. A key new feature of the model is that it requires only a few simple input parameters to project performance for wavefront codes with different structure to the sweeps in each iteration as well as different behavior during each wavefront computation and/or between iterations. We apply the model to three key benchmark applications that are used in high performance computing procurement, illustrating that the model parameters yield insight into the key differences among the codes. We also develop new, simple and h...
The main objective of the MPI communication library is to enable portable parallel programming with ...
In this paper we investigate the use of distributed graphics processing unit (GPU)-based architectur...
Prediction of the performance of parallel applications is a concept useful in several domains of sof...
This paper develops a highly accurate LogGP model of a complex wavefront application that uses MPI c...
The authors introduced a performance model for parallel, multidimensional, wavefront calculations wi...
The authors develop a model for the parallel performance of algorithms that consist of concurrent, t...
Pipelined wavefront computations are a ubiquitous class of parallel algorithm used for the solution ...
Pipelined wavefront computations are a ubiquitous class of high performance parallel algorithms use...
This paper details the development and application of a model for predictive performance analysis of...
Chandrasekaran, Sunita. Processor architectures have been rapidly evolving for decades. From the intro...