High Performance Computing is often performed on scarce and shared computing resources. To ensure computers are used to their full capacity, administrators often incentivize large workloads that are not possible on smaller systems. Measurements in Lattice QCD frequently do not scale to machine-size workloads. By bundling tasks together, we can create large jobs suitable for gigantic partitions. We discuss METAQ and mpi_jm, software developed to dynamically group computational tasks together and intelligently backfill idle time, without requiring substantial changes to users' current workflows or executables.
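The bundling-and-backfill idea can be sketched as a simple manager loop: hold one large node allocation, launch every queued task that fits in the currently idle nodes, and reclaim nodes as tasks finish. The following is a minimal Python sketch of that concept only; the task names, node counts, srun-style launch commands, and polling interval are illustrative assumptions, not the actual interface of METAQ or mpi_jm.

```python
import subprocess
import time
from collections import deque

TOTAL_NODES = 512   # size of the bundled allocation (assumed for illustration)
POLL_SECONDS = 30   # how often to reclaim nodes and backfill (assumed)

# Hypothetical task list: (name, nodes needed, launch command).
tasks = deque([
    ("cfg_0100", 32, ["srun", "-N", "32", "./measure", "cfg_0100"]),
    ("cfg_0101", 32, ["srun", "-N", "32", "./measure", "cfg_0101"]),
    # ... many more small, independent measurement tasks
])

running = []                # (process handle, nodes held) pairs
free_nodes = TOTAL_NODES

while tasks or running:
    # Reclaim nodes from any tasks that have finished.
    still_running = []
    for proc, nodes in running:
        if proc.poll() is None:
            still_running.append((proc, nodes))
        else:
            free_nodes += nodes
    running = still_running

    # Backfill: launch every queued task that fits in the idle nodes,
    # deferring the ones that do not fit yet.
    deferred = deque()
    while tasks:
        name, nodes, cmd = tasks.popleft()
        if nodes <= free_nodes:
            running.append((subprocess.Popen(cmd), nodes))
            free_nodes -= nodes
        else:
            deferred.append((name, nodes, cmd))
    tasks = deferred

    time.sleep(POLL_SECONDS)
```

Because the manager only sees launch commands, many small independent measurements can be packed into one machine-scale job without modifying the underlying executables.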