Apache Spark is a popular open-source distributed data processing framework that can efficiently process massive amounts of data. It exposes more than 180 configuration parameters, whose values users must choose manually based on their own experience. However, because the parameters are numerous and inherently correlated, manual tuning is very tedious. To avoid tuning by personal experience, we designed and implemented a reinforcement-learning-based Spark configuration parameter optimizer. First, we trained a Spark application performance prediction model with deep neural networks and verified the accuracy and effectiveness of the model from multiple perspectives. Second, in o...
Spark has gained growing attention in the past couple of years as an in-memory cloud computing platf...
Along with the explosive growth of data, there is a great demand to speed up the ability to process t...
For the past two years, Hopsworks, an open-source machine learning platform, has used Apache Spark t...
Apache Spark, famously known for its big data handling ability, is a distributed open-source framework t...
Apache Spark is a popular open-source distributed processing framework that enables efficient proces...
Apache Spark is an open-source distributed platform that uses the concept of distributed memory for...
Tuning configurations of Spark jobs is not a trivial task. State-of-the-art auto-tuning systems are ...
Nowadays, Spark Streaming, a computing framework based on Spark, is widely used to process streaming...
The distributed data analytic system Spark is a common choice for processing massive volumes of he...
Apache Spark is amazing when everything clicks. But if you haven’t seen the performance improvements...
In the era of Big Data, machine learning has taken on a whole new role. With the amount of data pres...
Spark has been established as an attractive platform for big data analysis, since it manages to hide...