In this work, we present a set of techniques that considerably improve the performance of executing concurrent MapReduce jobs. Our proposed solution relies on proper resource allocation for concurrent Hive jobs based on data dependency, inter-query optimization, and modeling of Hadoop cluster load. To the best of our knowledge, this is the first work on Hive/MapReduce job optimization that takes Hadoop cluster load into consideration. We perform an experimental study that demonstrates a 233% reduction in execution time for the concurrent versus the sequential execution scheme. We report up to 40% further reduction in execution time for concurrent job execution after resource-usage optimization. The results reported in this paper were obtained in a...
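The concurrent-versus-sequential comparison described in this abstract can be illustrated with a minimal sketch. This is not the authors' system: the job runner below is a hypothetical stand-in (real Hive jobs would be submitted through the Hive CLI or HiveServer2), and the simulated durations only show why overlapping independent jobs shortens total wall-clock time when the cluster has spare capacity.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_job(duration):
    # Hypothetical stand-in for a Hive/MapReduce job; a real job would
    # block on cluster scheduling and I/O rather than sleeping.
    time.sleep(duration)
    return duration

jobs = [0.2, 0.2, 0.2]  # simulated durations of three independent jobs

# Sequential scheme: total time is roughly the sum of job durations.
start = time.perf_counter()
for d in jobs:
    run_job(d)
sequential = time.perf_counter() - start

# Concurrent scheme: jobs overlap, so total time is bounded by the
# longest job (assuming the cluster can run all of them at once).
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
    list(pool.map(run_job, jobs))
concurrent = time.perf_counter() - start

print(sequential > concurrent)  # → True
```

In practice the gap is smaller than this idealized sketch suggests, because concurrent jobs contend for the same map/reduce slots and HDFS bandwidth, which is exactly the cluster-load effect the abstract says the proposed resource allocation accounts for.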
This study proposes an improvement and implementation of an enhanced Hadoop MapReduce workflow that deve...
As the era of “big data” has arrived, more and more companies start using distributed file systems t...
The underlying assumption behind Hadoop and, more generally, the need for distributed processing is ...
In this work, we present a set of techniques that considerably improve the performance of executing ...
Hadoop version 1 (HadoopV1) and version 2 (YARN) manage the resources in a distributed sy...
MapReduce is a popular model of executing time-consuming analytical queries as a batch of tasks on l...
One of the most widely used frameworks for programming MapReduce-based applications is Apac...
Big data is an emerging concept involving complex data sets which can give new insight and distill n...
In the present-day scenario, the cloud has become an inevitable need for the majority of IT operational organizat...
The performance of the MapReduce-based Cloud data warehouses mainly depends on the virtual hardware ...
As a core component of Hadoop, an open cloud platform, MapReduce is a distributed an...
MapReduce is a popular programming model for distributed processing of large data sets. Apache Hadoo...
MapReduce is a popular parallel computing paradigm for large-scale data processing in clusters and d...
This research proposes a novel runtime system, Habanero Hadoop, to tackle the inefficient utilizatio...