Session 8: Potpourri (Short Paper)

YARN is a popular cluster resource management platform. It does not, however, manage network bandwidth, which can significantly affect the performance of tasks that transfer large volumes of data within the cluster. The shuffle phase of MapReduce jobs features many such tasks. The impact of underutilized network bandwidth on shuffle tasks is more pronounced when the network bandwidth capacities of the nodes in the cluster vary. We present BAShuffler, a bandwidth-aware shuffle scheduler that can maximize overall network bandwidth utilization by scheduling the source nodes of the fetch flows at the application level. BAShuffler can fully utilize the net-wor...
Data skew, cluster heterogeneity, and network traffic are three issues that significantly influence ...
Big data analytics is an indispensable tool in transforming science, engineeri...
Newly popular Internet applications such as WebTV and Internet streaming require networ...
Whether it is for e-science or business, the amount of data produced every yea...
In the context of Hadoop, recent studies show that the shuffle operation accounts for as much as a t...
The MapReduce framework has become the de facto scheme for scalable semi-structured and un-structured...
Hadoop is a popular implementation of the MapReduce framework for running data-intensive jobs on clu...
In the last year, Hadoop YARN has become the de facto standard resource management platform for data-...
The MapReduce framework has become the de facto scheme for scalable semi-structured and un-structure...
Hadoop version 1 (HadoopV1) and version 2 (YARN) manage the resources in a distributed sy...
During the shuffle stage of the MapReduce framework, a large volume of data may be relocated to the ...
Peer-to-peer computing, the harnessing of idle compute cycles throughout the Internet, of...
MapReduce has become an important distributed processing model for large-scale data-intensi...
To reduce the impact of network congestion on big data jobs, cluster management frameworks use vario...
MapReduce has become an important distributed processing model for large-scale data-intensiv...