Spark, a distributed data analytics system, is a common choice for processing massive volumes of heterogeneous data, but tuning its parameters to achieve high performance is challenging. Recent studies employ auto-tuning techniques to address this problem but suffer from three issues: limited functionality, high overhead, and inefficient search. In this paper, we present a general and efficient Spark tuning framework that addresses all three issues simultaneously. First, we introduce a generalized tuning formulation that conveniently supports multiple tuning goals and constraints, together with a Bayesian optimization (BO) based solution to this generalized optimization problem. Second, to avoid high overhead from additi...
Big data processing systems (e.g., Hadoop, Spark, Storm) contain a vast number of configuration para...
Data analytics in the cloud has become an integral part of enterprise business...
Spark, a distributed data analytics system, is a common choice for processing massive volumes of he...
Tuning configurations of Spark jobs is not a trivial task. State-of-the-art auto-tuning systems are ...
Spark has been established as an attractive platform for big data analysis, since it manages to hide...
Database and big data analytics systems such as Hadoop and Spark have a large number of configuratio...
Apache Spark, well known for its ability to handle big data, is a distributed open-source framework t...
Hadoop provides a scalable solution on traditional cluster-based Big Data platforms but imposes per...
Apache Spark is a popular open-source distributed data processing framework that can efficiently pro...
Apache Spark is a popular open-source distributed processing framework that enables efficient proces...