Resource capacity is often overprovisioned, primarily to deal with short periods of peak load. Shaping these peaks by shifting them to low-utilization periods (valleys) is referred to as "resource consumption shaping". While originally aimed at the data center level, the resource consumption shaping we consider focuses on local resources, such as CPU or I/O, because we have observed that individual jobs also incur load peaks and valleys on these resources. In this paper, we present Local Resource Shaper (LRS), which limits fairness in resource sharing between co-located MapReduce tasks. LRS enables Hadoop to maximize resource utilization and minimize resource contention independently of job type. Co-located MapReduce tasks are often prone to resou...
In this paper we present a MapReduce task scheduler for shared environments in which MapReduce is ex...
MapReduce has become a popular high performance computing paradigm for large-scale data processing. ...
The majority of large-scale data-intensive applications executed by data centers are based...
Today, resource capacity is no longer an issue for running large-scale distributed systems, such as ...
Hadoop, an open source implementation of MapReduce, uses slots to represent resource sharing. The nu...
The MapReduce framework and its open source implementation Hadoop have become the de facto platform f...
Part 4: Green Computing and Resource Management. We present a resource-aware sch...
Running multiple instances of the MapReduce framework concurrently in a multicluster system or datac...
MapReduce is a popular parallel computing paradigm for large-scale data processing in clusters and d...
Over the last ten years MapReduce has emerged as one of the staples of distributed computing both in...
Inspired by the victory of Apache's Hadoop, this paper suggests a new reduce task scheduler. ...
For large-scale parallel applications, MapReduce is a widely used programming model. MapReduce is an ...
Applications in many areas are increasingly developed and ported using the MapReduce framework (more...
As distributed computing systems are used more widely, driven by trends such as 'big data' and cloud...
MapReduce has become a popular model for large-scale data processing in recent years. Howeve...