Load imbalance is a long-standing source of inefficiency in high-performance computing. The situation has only worsened as applications and systems increase in complexity, e.g., through adaptive mesh refinement, DVFS, memory hierarchies, power and thermal management, and manufacturing processes. Load balancing is often implemented in the application itself, but doing so obscures the application logic and may require extensive code refactoring. This paper presents an automated and transparent dynamic load balancing approach for MPI applications with OmpSs-2 tasks, which relieves applications of this burden. Only local and trivial changes to the application are required. Our approach exploits the ability of OmpSs-2@Cluster to offload tasks for execution on other...
Proceedings of the First PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD 2016)...
The load imbalance problem is one of the major obstacles to achieving optimal performance of High Perfor...
The largest supercomputers have millions of independent processors, and concurrency levels are rapid...
The hybrid programming model MPI+OpenMP is useful for solving the problems of load balancing of parall...
It is well known that load imbalance is a major source of efficiency loss in HPC (High Performance C...
Balancing the workload of sophisticated simulations is inherently difficult, since we have to balanc...
State-of-the-art programming approaches generally have a strict division between intra-node shared m...
In this paper we introduce a methodology for dynamic job reconfiguration driven by the programming m...
The DLB (Dynamic Load Balancing) library and LeWl (LEnd When Idle) algorithm provide a runtime solut...
Power consumption is a very important issue for the HPC community, both at the level of one application ...
Adaptive workloads can change the configuration of their jobs on the fly, in terms of number of pro...
©2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for al...
In this paper we describe the design of fault tolerance capabilities for general-purpose offload sem...
This research demonstrates that the automatic implementation of a dynamic load balancing (DLB) strat...
This paper presents the evolution of the free agent threads for OpenMP to the new role-shifting thre...