There is a trade-off between the number of concurrently running MapReduce jobs and their corresponding map and reduce tasks within a node in a Hadoop cluster. Leaving this trade-off statically configured to a single value can significantly degrade job response times and leave resource usage suboptimal. To overcome this problem, we propose a feedback control loop-based approach that dynamically adjusts the Hadoop resource manager configuration based on the current state of the cluster. A preliminary assessment based on workloads synthesized from real-world traces shows that system performance can be improved by about 30% compared to the default Hadoop setup.
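To make the idea of a feedback control loop concrete, the following is a minimal sketch and not the controller proposed in the paper, whose details the abstract does not give. It assumes a single observed signal (node utilization between 0 and 1) and a single actuator (a per-node limit on concurrent tasks). The function names, the target of 0.8, the dead band of 0.1, and the limits 1 and 16 are all hypothetical values chosen for illustration.

```python
def next_task_limit(current_limit, utilization, target=0.8, band=0.1,
                    min_limit=1, max_limit=16):
    """Return the per-node concurrent-task limit for the next control period.

    All parameters are illustrative assumptions, not Hadoop settings.
    """
    if utilization > target + band:
        proposed = current_limit - 1   # node is saturated: back off
    elif utilization < target - band:
        proposed = current_limit + 1   # node has headroom: admit more tasks
    else:
        proposed = current_limit       # inside the dead band: hold steady
    # Clamp so the limit never leaves the allowed range.
    return max(min_limit, min(max_limit, proposed))


def run_controller(utilization_samples, initial_limit=4):
    """Apply the control rule once per observed utilization sample."""
    limit = initial_limit
    history = []
    for utilization in utilization_samples:
        limit = next_task_limit(limit, utilization)
        history.append(limit)
    return history


if __name__ == "__main__":
    # Hypothetical utilization readings taken once per control period.
    print(run_controller([0.3, 0.5, 0.95, 0.8]))
```

The dead band around the target keeps the limit from oscillating when utilization hovers near the set point. A real deployment would have to read the signal from cluster metrics and write the new limit into the resource manager configuration, which this sketch leaves out.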
Over the last ten years MapReduce has emerged as one of the staples of distributed computing both in...
Running multiple instances of the MapReduce framework concurrently in a multicluster system or datac...
Hadoop YARN is an Apache Software Foundation's open project that provides a resource management f...
In a Hadoop cluster, the performance and the resource consumption of MapReduce j...
The MapReduce framework and its open source implementation Hadoop have become the de facto platform f...
Big data is an emerging concept involving complex data sets which can give new insight and distill n...
MapReduce is a popular programming model for distributed data processing and B...
One of the most widely used frameworks for programming MapReduce-based applications is Apac...
Hadoop version 1 (HadoopV1) and version 2 (YARN) manage the resources in a distributed sy...
In this paper, we address the problem caused by fixed assignment of task slots in Hadoop MapReduce. ...
Companies have fast-growing amounts of data to process and store, a data exp...
At present, the Hadoop framework, based on the MapReduce computing model, has gradually become the most famous dis...
Containers are considered an optimized fine-grain alternative to virtual machi...
MapReduce is a popular parallel computing paradigm for large-scale data processing in clusters and d...
Resource capacity is often over-provisioned, primarily to deal with short periods of peak load. Shapi...