An Efficient Key Partitioning Scheme for Heterogeneous MapReduce Clusters

Publication date

January 2016

DOI

10.1109/ICACT.2016.7423394

Publisher

Global IT Research Institute

Journal

1738-9445

Abstract

Hadoop is a standard implementation of the MapReduce framework for running data-intensive applications on clusters of commodity servers. A close study of the framework shows that the shuffle phase, the all-to-all phase in which reduce tasks fetch their input data, significantly affects application performance. In Hadoop's MapReduce system, both the frequencies of the intermediate keys and their distribution among data nodes vary across the cluster. This variance causes network overhead and leads to unfair reduce input among the data nodes in the cluster. As a result, applications experience performance degradation during the shuffle phase. ...
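
The imbalance the abstract describes can be illustrated with a minimal sketch. It is not the partitioning scheme proposed in the paper, which the truncated abstract does not detail. It only shows how plain hash partitioning can give reducers unequal input when key frequencies are skewed. The key counts, the reducer count, and the function names are hypothetical, and `zlib.crc32` is used as a stand-in for a stable hash function.

```python
import zlib
from collections import Counter


def hash_partition(key: str, num_reducers: int) -> int:
    """Assign a key to a reducer by a stable hash modulo the reducer count.

    Assumption: crc32 stands in for whatever hash a real partitioner uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reducer_loads(key_counts: dict, num_reducers: int) -> list:
    """Sum the number of intermediate records each reducer would receive."""
    loads = [0] * num_reducers
    for key, count in key_counts.items():
        loads[hash_partition(key, num_reducers)] += count
    return loads


def imbalance(loads: list) -> float:
    """Ratio of the heaviest reducer's input to the mean input (1.0 = balanced)."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean if mean else 0.0


# Hypothetical, deliberately skewed intermediate key frequencies.
key_counts = Counter(
    {"the": 5000, "of": 3000, "hadoop": 40, "shuffle": 25, "reduce": 20, "node": 15}
)

loads = reducer_loads(key_counts, num_reducers=3)
print("records per reducer:", loads)
print("imbalance ratio:", round(imbalance(loads), 2))
```

Because every record with the same key must go to the same reducer, a few very frequent keys dominate whichever reducers they hash to, regardless of how evenly the remaining keys are spread.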
