Abstract. This paper studies the influence that job placement may have on scheduling performance, in the context of massively parallel computing systems. A simulation-based performance study is carried out, using workloads extracted from real system logs. The starting point is a parallel system built around a k-ary n-tree network and using well-known scheduling algorithms (FCFS and backfilling). We incorporate an allocation policy that tries to assign to each job a contiguous network partition, in order to improve communication performance. This policy results in severe scheduling inefficiency due to increased system fragmentation. A relaxed version of it, which we call quasi-contiguous allocation, reduces this adverse effect. Experimen...
Scheduling of large-scale, distributed topology-aware applications requires that not only the proper...
Two strategies are used for the allocation of jobs to processors connected by mesh topologies: conti...
Abstract. This paper studies the influence that task placement may have on the performance of applica...
In this paper, we utilize a bandwidth-centric job communication model that captures the i...
Torus-connected networks are widely used in modern supercomputers due to their linear per node cost scal...
Contiguous allocation of parallel jobs usually suffers from the degrading effects of fragmentation a...
Abstract. The performance of contiguous allocation strategies can be significantly affected by the d...
Network interference of nearby jobs has been recently identified as the dominant reason for the high...
The performance of contiguous allocation strategies can be significantly affected by the distributio...
Abstract. Recent success in building petascale computing systems poses new challenges in job schedul...
Over the last decade, much research in the area of scheduling has concentrated on single cluster sys...
The performance of contiguous allocation strategies can be significantly affected by the type of the...
Abstract. The performance of contiguous allocation strategies can be significantly affected by the d...
We present a joint scheduling and memory allocation algorithm for efficient ex...