Computing clusters generate very large amounts of data, including node resource data and application-related data, among others. However, current systems do not exploit the full potential of this data. This thesis attempts to put cluster telemetry data to use for two different purposes: scheduling and workload estimation. Motivated by recent advances in machine learning, a Deep Reinforcement Learning (DRL) based scheduler is proposed. Two different scheduling experiments are performed in a simulated cluster environment. The results show that the DRL-based scheduler can be trained on specific cluster architectures to optimize performance parameters such as job completion time, hence obtaining the ...
Using neural networks to find optimal solutions to real-time scheduling is a common technique, and t...
Every day most people are using applications and services that are utilising machine learning, in so...
Millions of battery-powered sensors deployed for monitoring purposes in a multitude of scenarios, e....
Resource usage of production workloads running on shared compute clusters often fluctuates significan...
The aim of this paper is to provide a description of a machine-learning-based scheduling approach for ...
With more businesses running online, the scale of data centers is increasing dramatically. The t...
The ability to manage the distributed functionality of large multi-vendor networks will be an import...
Stemming from the growth and increased complexity of computer vision, natural language processing, a...
As High Performance Computing (HPC) has grown considerably and is expected to grow even more, effect...
In recent years, the rapid development of artificial intelligence and data science has give...
With widespread advances in machine learning, a number of large enterprises are beginning to incorpo...
Deep neural networks (DNNs) have recently yielded strong results on a range of applications. Trainin...
Attempts to address the production scheduling problem thus far rely on simplifying assumptions, such...
A machine learning job comprises a variety of resource-intensive tasks. The loads of executing such t...