Modern distributed computing frameworks such as Apache Hadoop, Spark, or Storm distribute the workload of applications across a large number of machines. While they abstract away the details of distribution, they still require the programmer to set a number of configuration parameters before deployment. These parameter settings usually have a substantial impact on execution efficiency. Finding good values for these parameters is difficult and requires domain, application, and framework expertise. In this paper, we propose a machine learning approach to the problem of configuring a distributed computing framework. Specifically, we propose using Bayesian Optimization to find good parameter settings. In an extensive empirical...
Bayesian Optimization (BO) is an efficient method for finding optimal cloud computing configurations...
Apache Spark is an open source distributed platform which uses the concept of distributed memory for...
As computer systems continue to increase in complexity, the need for AI-based solutions is becoming ...
Finding optimal configurations for Stream Processing Systems (SPS) is a challenging problem due to t...
The complexity and diversity of today's architectures require an additional effort from the programm...
As more aspects of our daily lives are being computerized, ever larger amounts of data are being pro...
This thesis addresses many open challenges in hyperparameter tuning of machine learning algorithms. ...
Thesis (Ph.D.)--University of Washington, 2019. Distributed systems consist of many components that in...
The prosperity of Big Data owes to the advances in distributed computing systems, which make it poss...
A current challenge for data management systems is to support the construction and maintenance of ma...
The use of machine learning algorithms frequently involves careful tuning of learning parameters and...
Big data processing systems (e.g., Hadoop, Spark, Storm) contain a vast number of configuration para...
Deep neural networks have recently become astonishingly successful at many machine learning problems...