With the increasing amount of data available to scientists in disciplines as diverse as bioinformatics, physics, and remote sensing, scientific workflow systems are becoming increasingly important for composing and executing scalable data analysis pipelines. When writing such workflows, users need to specify the resources to be reserved for tasks so that sufficient resources are allocated on the target cluster infrastructure. Crucially, underestimating a task’s memory requirements can result in task failures. Therefore, users often resort to overprovisioning, resulting in significant resource wastage and decreased throughput. In this paper, we propose a novel online method that uses monitoring time series data to predict task mem...
The ability to accurately estimate the execution time of computationally expensive e-science algorit...
Many techniques such as scheduling and resource provisioning rely on performance prediction of workf...
Scientific workflows typically comprise a multitude of different processing steps which often are ex...
With the increasing amount of data available to scientists in disciplines as diverse as bioinformati...
Estimates of task runtime, disk space usage, and memory consumption are commonly used by scheduling...
Task characteristic estimates such as runtime, disk space, and memory consumption are commonly u...
In this paper we propose a novel method for auto-scaling data-centric workflow tasks. Scaling is ach...
Many scientific workflow scheduling algorithms need to be informed about task runtimes a priori to c...
Scientific workflows are often used to automate large-scale data analysis pipelines on clusters, gri...
Many scientific workflow scheduling algorithms need to be informed about task runtimes a priori to c...
System provisioning, resource allocation, and system configuration decisions for workflow...
The progression of scientific data leads to an increase in the demand of powerful high performance c...
Scientific workflows, which capture large computational problems, may be executed on large...
Scientific workflow management systems like Nextflow support large-scale data analysis by abstractin...
Dismal performance often results when the memory requirements of a process exceed the physical memor...