Size-based scheduling with aging has been recognized as an effective approach to guarantee fairness and near-optimal system response times. We present HFSP, a scheduler that brings this technique to Hadoop, a real, complex, and widely used multi-server system. Size-based scheduling requires a priori job size information, which is not available in Hadoop: HFSP builds this knowledge by estimating job sizes on-line during job execution. Our experiments, which are based on realistic workloads generated via a standard benchmarking suite, point to a significant decrease in system response times with respect to the widely used Hadoop Fair scheduler, without impacting the fairness of the scheduler, and show that HFSP is largely tolerant to job ...
Hadoop has been recently used to process a diverse variety of applications, sh...
The exponential growth of collected data poses the challenge of efficient data processing among othe...
Job scheduling in high-performance computing platforms is a hard problem that involves uncertainties...
Size-based scheduling with aging has been recognized as an effective approach to guarantee fairness ...
Size-based scheduling with aging has, for long, been recognized as an effective approach to...
Size-based scheduling with aging has, for long, been recognized as an effective approach to guarante...
The past decade has seen the rise of data-intensive scalable computing (DISC) systems, such as Hado...
Size-based schedulers have very desirable performance properties: optimal or near-optimal response t...
Size-based schedulers have very desirable performance properties: optimal or near-optimal r...
We study size-based schedulers, and focus on the impact of inaccurate job size information on respon...
Cloud computing is a powerful platform to deal with big data. Among several software frameworks used fo...
The past decade has seen the emergence of parallel systems for the analysis of large quantities of d...
The majority of large-scale data-intensive applications executed by data centers are based on MapReduce...
Hadoop is a free, Java-based programming system that supports the processing of vast informational co...
At present, big data is very popular because it has proved highly successful in many fields suc...