With the increasing amount of data available to scientists in disciplines as diverse as bioinformatics, physics, and remote sensing, scientific workflow systems are becoming increasingly important for composing and executing scalable data analysis pipelines. When writing such workflows, users need to specify the resources to be reserved for tasks so that sufficient resources are allocated on the target cluster infrastructure. Crucially, underestimating a task's memory requirements can result in task failures. Therefore, users often resort to overprovisioning, resulting in significant resource wastage and decreased throughput. In this paper, we propose a novel online method that uses monitoring time series data to predict task memory usage...
Scientific workflows are often used to automate large-scale data analysis pipelines on clusters, gri...
This work presents a scale-space based approach to assist dynamic resource provisioning. The applica...
Proactive auto-scaling techniques aim to predict the future workload of web applications to provis...
Estimates of task runtime, disk space usage, and memory consumption, are commonly used by scheduling...
Task characteristics estimations such as runtime, disk space, and memory consumption, are commonly u...
Many scientific workflow scheduling algorithms need to be informed about task runtimes a-priori to c...
In this paper we propose a novel method for auto-scaling data-centric workflow tasks. Scaling is ach...
Scientific workflows are designed as directed acyclic graphs (DAGs) and consist of multiple dependen...
The progression of scientific data leads to an increase in the demand of powerful high performance c...
In this paper we examine the issue of optimizing disk usage and scheduling large-scale scientific wo...
Machine learning algorithms are widely used today for analytical tasks such as data cleaning, data c...
Scientific workflow management systems like Nextflow support large-scale data analysis by abstractin...
Infrastructure as a service clouds hide the complexity of maintaining the physical infrastructure wi...