Scheduling Recurring Distributed Dataflow Jobs Based on Resource Utilization and Interference

Thamsen, Lauritz
Rabier, Benjamin
Schmidt, Florian
Renner, Thomas
Kao, Odej

Publication date

January 2017

DOI

10.1109/BigDataCongress.2017.28

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Abstract

Resource management systems like YARN or Mesos enable users to share cluster infrastructures by running analytics jobs in temporarily reserved containers. These containers are typically not isolated, in order to achieve a high degree of overall resource utilization despite the often fluctuating resource usage of individual analytics jobs. However, some combinations of jobs utilize the resources better and interfere less with each other than other combinations when running on the same nodes. This paper presents an approach for improving resource utilization and job throughput when scheduling recurring data analysis jobs in shared cluster environments. Using a reinforcement learning algorithm, the scheduler continuously learns which jobs are best executed simultan...
