Provenance, the metadata that records the derivation history of scientific results, is important in scientific workflows for interpreting, validating, and analyzing the results of scientific computing. Recently, to promote and facilitate interoperability among heterogeneous provenance systems, the Open Provenance Model (OPM) has been proposed and has played an important role in the community. In this dissertation, to efficiently query and manage OPM-compliant provenance, we first propose a provenance collection framework that collects both prospective provenance, which captures an abstract workflow specification as a recipe for future data derivation, and retrospective provenance, which captures past workflow execution and data derivation information...
Scientific experiments are becoming increasingly large and complex, with a commensurate increase in ...
In the era of big data, scientific workflows have become essential to automate scientific experiment...
Scientists can facilitate data intensive applications to study and understand the behavior of a comp...
Serving as a record of what happened during a scientific process, often computational, p...
Collaborative data science activities are becoming pervasive in a variety of communities, and are of...
Within computer science, the term provenance has multiple meanings, due to different motivations, pe...
Workflow provenance is a crucial part of a workflow system as it enables data lineage analysis, erro...
The importance of maintaining provenance has been widely recognized. Currently there are two approac...
The automated tracking and storage of provenance information promises to be a major a...
The provenance of a data product contains information about how the product was derived, and is cruc...
Data provenance tools seek to facilitate reproducible data science and auditable data analyses by ca...
Scientific research has moved from an isolated environment into a collaborative culture due to the da...
Integrated provenance support promises to be a chief advantage of scientific workflow systems over s...
The Open Provenance Architecture (OPA) approach to the challenge was distinct in several regards. In...