In stream data processing, data arrives continuously and is processed by decision making, process control and e-science applications. To control and monitor these applications, reproducibility of results is a vital requirement. However, storing fine-grained provenance data requires a massive amount of storage space, especially for transformations with overlapping sliding windows. In this paper, we propose techniques that significantly reduce storage costs while achieving high accuracy. Our evaluation shows that the adaptive inference technique can achieve almost 100% accurate provenance information for a given dataset at lower storage costs than the other techniques. Moreover, we present a guideline about the usage of different prove...
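The storage concern raised in the abstract above can be illustrated with a small sketch. It is not the paper's technique: it assumes a count-based sliding window over tuples numbered from 0, and the function names (`explicit_provenance`, `inferred_provenance`) and parameters (`window`, `slide`) are hypothetical. It contrasts storing every (output, input) pair explicitly with recomputing the contributing inputs from the window definition alone.

```python
def explicit_provenance(n_tuples, window, slide):
    """Store one (output_id, input_id) pair per contributing input tuple.

    Assumes a count-based window of `window` tuples advancing by `slide`
    tuples (hypothetical setup, chosen only for illustration).
    """
    pairs = []
    out_id = 0
    start = 0
    while start + window <= n_tuples:
        for input_id in range(start, start + window):
            pairs.append((out_id, input_id))
        out_id += 1
        start += slide
    return pairs


def inferred_provenance(out_id, window, slide):
    """Recompute the contributing input ids without storing any pairs."""
    start = out_id * slide
    return list(range(start, start + window))


# Overlapping windows (slide < window) repeat each input in several outputs.
overlapping = explicit_provenance(n_tuples=100, window=10, slide=2)
# Non-overlapping windows (slide == window) use each input exactly once.
tumbling = explicit_provenance(n_tuples=100, window=10, slide=10)

print(len(overlapping), len(tumbling))
```

In this idealized setting the inferred form stores nothing per output, and the overlapping configuration stores several times as many pairs as the non-overlapping one. With real streams, timing and arrival variations are the kind of factor that could make inference inexact, which is presumably why accuracy is evaluated alongside storage cost.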
Many applications now involve the collection of large amounts of data from multiple users, and then ...
Provenance describes how results are produced starting from data sources, curation, recovery, interm...
Data provenance tools seek to facilitate reproducible data science and auditable data analyses by ca...
In stream data processing, data arrives continuously and is processed by decision making, process co...
Fine-grained data provenance ensures reproducibility of results in decision making, process control ...
Scientists can facilitate data intensive applications to study and understand the behavior of a comp...
Applications that require continuous processing of high-volume data streams have grown in prevalence...
Decision making, process control and e-science applications process stream data, mostly produced by ...
Data provenance tools seek to facilitate reproducible data science and auditable data analyses by ca...
The description of the origins of a piece of data and the transformations by which it arrived in a d...
E-science applications use fine-grained data provenance to maintain the reproducibility of scientifi...
Managing fine-grained provenance is a critical requirement for data stream management systems (DSMS)...
Managing fine-grained provenance is a critical requirement for data stream management systems (DSMS)...
Provenance is a type of meta-data that describes the history or ancestry of an object. Although prov...
Applications that operate over streaming data with high-volume and real-time processing requirements ...