Abstract. In many real applications, for example, those with frequent and irregular communication patterns or those using large messages, network contention and contention for message processing resources can be a significant part of the total execution time. This paper presents a new cost model, called LoGPC, that extends the LogP [9] and LogGP [4] models to account for the impact of network contention and network interface DMA behavior on the performance of message passing programs. We validate LoGPC by analyzing three applications implemented with Active Messages [11], [19] on the MIT Alewife multiprocessor. Our analysis shows that network contention accounts for up to 50 percent of the total execution time. In addition, we show that the ...
In this paper, we adapt Gustafson-Barsis' law to evaluate the effect of communication on the pe...
Data-parallel applications executing in clustered environments share resources with other applicatio...
Accurate models of parallel computation are often crucial to optimize parallel algorithms for their ...
We present a new model of parallel computation---the LogGP model---and use it to analyze a number of...
Abstract—Many existing models of point-to-point communication in distributed systems ignore the impa...
Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer S...
Most applications share the resources of networked workstations with other applications. Since syste...
Abstract. Performance modeling is important for implementing efficient parallel applications and run...
Multi-core clusters are cost-effective clusters largely used in high-performan...
To amortize the cost of MPI communications, distributed parallel HPC applicati...
The goal of this paper is to gain insight into the relative performance of communication mechanisms ...
Moving data between processes has often been discussed as one of the major bottlenecks in parallel c...
Network performance measurement and prediction is very important to predict the running time of high...
A quantitative comparison of the BSP and LogP models of parallel computation is developed. We concen...