In this paper, we present a size-based scheduling protocol for Hadoop that caters to both the interactivity and efficiency requirements of current Hadoop clusters. Our scheduler addresses several challenges, such as job size estimation, resource management, and the scheduling of complex jobs with inter-related phases. Furthermore, by employing job aging, we avoid the job starvation typical of well-known size-based policies. Our experiments point to a significant decrease in average job sojourn time (a metric that accounts for the total time a job spends in the system, including waiting and service times) for realistic workloads generated according to production traces available in the literature.
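To make the sojourn-time metric concrete, the following is a minimal sketch, not the scheduler described above. It assumes a single server, jobs that all arrive at time 0, exactly known job sizes, and no preemption or aging; the function name `mean_sojourn_time` and the job sizes are hypothetical and chosen only for illustration.

```python
from typing import List


def mean_sojourn_time(sizes: List[float]) -> float:
    """Mean sojourn time when jobs arriving at t=0 run one at a time, in list order."""
    clock = 0.0
    total = 0.0
    for size in sizes:
        clock += size   # completion time: waiting for earlier jobs plus own service time
        total += clock  # sojourn time = completion time - arrival time (arrival is 0)
    return total / len(sizes)


# Hypothetical job sizes in arbitrary time units (an assumption, not data from the paper).
job_sizes = [10.0, 1.0, 2.0]

fifo = mean_sojourn_time(job_sizes)                # serve in arrival order
size_based = mean_sojourn_time(sorted(job_sizes))  # serve the smallest job first

print(f"FIFO mean sojourn time:       {fifo:.2f}")
print(f"Size-based mean sojourn time: {size_based:.2f}")
```

In this toy setting, serving small jobs first lowers the mean sojourn time because the large job no longer delays the two small ones. The sketch deliberately leaves out the aspects the abstract highlights: estimating job sizes, handling multi-phase jobs, and aging to protect large jobs from starvation.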
Abstract—Size-based schedulers have very desirable performance properties: optimal or near-optimal r...
We study size-based schedulers, and focus on the impact of inaccurate job size information on respon...
The exponential growth of collected data poses the challenge of efficient data processing among othe...
Abstract—Size-based scheduling with aging has, for long, been recognized as an effective approach to...
Size-based scheduling with aging has, for long, been recognized as an effective approach to guarante...
Size-based scheduling with aging has been recognized as an effective approach to guarantee fairness ...
The past decade has seen the rise of data-intensive scalable computing (DISC) systems, such as Hado...
Hadoop is a free, Java-based programming framework that supports the processing of vast informational co...
The majority of large-scale data-intensive applications executed by data centers are based on MapReduce...
The past decade has seen the emergence of parallel systems for the analysis of large quantities of d...
Hadoop has been recently used to process a diverse variety of applications, sh...
Size-based schedulers have very desirable performance properties: optimal or near-optimal response t...
Today, we live in the data age, and a key metric of the present time is the amount of data tha...
At present, big data is very popular because it has proved very successful in many fields suc...