In this paper we examine the problem of optimizing disk usage and scheduling large-scale scientific workflows onto distributed resources, where the workflows are data-intensive and require large amounts of data storage, and the resources have limited storage capacity. Our approach is two-fold: we minimize the amount of space a workflow requires during execution by removing data files at runtime when they are no longer needed, and we demonstrate that workflows may have to be restructured to reduce their overall data footprint. We show the results of our data management and workflow restructuring solutions using a Laser Interferometer Gravitational-Wave Observatory (LIGO) application and an astronomy application, Montage, running on...
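The idea of removing files at runtime once no remaining task needs them can be illustrated with a small sketch. This is not the paper's algorithm; it is a minimal model, assuming a workflow given as a hypothetical mapping from task name to (inputs, outputs), a fixed execution order, and made-up file sizes. The function name `peak_footprint` and the `keep` parameter (files that must survive, such as final results) are illustrative assumptions.

```python
from collections import Counter


def peak_footprint(order, tasks, sizes, cleanup=True, keep=()):
    """Return the peak disk usage while running tasks in the given order.

    tasks: {task_name: (input_files, output_files)}  (hypothetical format)
    sizes: {file_name: size}
    keep:  files that are never deleted, e.g. final results
    """
    # Number of not-yet-run tasks that still read each file.
    pending = Counter(f for ins, _ in tasks.values() for f in ins)
    # Files produced by no task are assumed to be staged in before execution.
    produced = {f for _, outs in tasks.values() for f in outs}
    on_disk = {f for f in pending if f not in produced}
    peak = sum(sizes[f] for f in on_disk)

    for t in order:
        ins, outs = tasks[t]
        # While the task runs, its inputs and outputs coexist on disk.
        on_disk.update(outs)
        peak = max(peak, sum(sizes[f] for f in on_disk))
        for f in ins:
            pending[f] -= 1
        if cleanup:
            # Delete every file that no remaining task reads.
            on_disk -= {f for f in on_disk if pending[f] == 0 and f not in keep}
    return peak


# Toy three-stage pipeline with illustrative sizes (arbitrary units).
tasks = {
    "a": (["raw"], ["x"]),
    "b": (["x"], ["y"]),
    "c": (["y"], ["out"]),
}
sizes = {"raw": 10, "x": 8, "y": 6, "out": 2}
order = ["a", "b", "c"]

without_cleanup = peak_footprint(order, tasks, sizes, cleanup=False)
with_cleanup = peak_footprint(order, tasks, sizes, cleanup=True, keep={"out"})
```

In this toy pipeline the peak drops because `raw` and `x` are deleted as soon as their only consumers finish. The sketch also hints at why restructuring can matter: the peak depends on the order in which tasks run, so a different ordering of the same workflow can yield a different footprint.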
Workflows may be defined as abstractions used to model the coherent flow of ac...
Cloud computing has empowered users to provision virtually unlimited computational resource...
Scientific applications often perform complex computational analyses that consume and pro...
In this paper we examine the issue of optimizing disk usage and scheduling large-scale scientific wo...
Scientific workflows are often used to automate large-scale data analysis pipelines on clusters, gri...
In this paper we examine the issue of optimizing disk usage and of scheduling large-scale scientific...
Large-scale applications can be expressed as a set of tasks with data dependencies between them, als...
Scientists in different fields, such as high energy physics, earth science, and astronomy are develo...
The scale of scientific applications becomes increasingly large not only in computation, but also in...
In recent years, scientific communities have increasingly adopted computational workflow...
The development of grid and workflow technologies has enabled complex, loosely coupled scientific ap...
Scientific exploration demands heavy usage of computational resources for large-scale and deep analy...
Cloud computing has empowered users to provision virtually unlimited computational resources and are...
Scientific workflows feature complex precedence constraints that are mostly dictated by data depende...
Large-scale, data-intensive scientific applications are often expressed as sci...