Production machine performance has large variability. On the UK National Supercomputing Service, the time a job takes to complete can vary by as much as 53%. Load imbalance and shared-resource contention are largely responsible, but we find that previous efforts to model application/architecture performance typically do not take these into account. In this research we model and simulate network contention, which allows us to explore the impact of multiple interacting jobs and approaches to alleviating these effects, including network redesign and communication staging within applications. We show the utility of this work on a variety of systems and interacting applications.
Multi-core clusters are cost-effective clusters largely used in high-performan...
Reliably upperbounding contention in multicore shared resources is of prominent importance in the ea...
This paper presents a generalized model of tightly-coupled multiprocessor systems which is then simp...
Most applications share the resources of networked workstations with other applications. Since syste...
Many research results in recent years have focused on the design of distributed shared memory (DSM...
As the field of High Performance Computing (HPC) approaches the Exascale era we see larger systems c...
Network interference of nearby jobs has been recently identified as the dominant reason for the high...
Data-parallel applications executing in clustered environments share resources with other applicatio...
Networks of workstations (NOWs) are becoming increasingly popular as a cost-effective alternative t...
In order to be able to develop robust and effective parallel applications and algorithms, one should...
This paper studies the influence that task placement may have on the performance of applica...
Overlapping communication and computation allows both processors and network to be utilized concurre...
Shared cache contention can cause significant variability in the performance of co-running applicat...
A key obstacle to large-scale network simulation over PC clusters is the memory balancing problem wh...