Join queries are a fundamental database tool, capturing a range of tasks that involve linking heterogeneous data sources. However, with massive table sizes, it is often impractical to keep the tables in memory, and we can only take one or a few streaming passes over them. Moreover, building out the full join result (e.g., linking heterogeneous data sources along quasi-identifiers) can lead to a combinatorial explosion of results due to many-to-many links. Random sampling is a natural tool to boil this oversized result down to a representative subset with well-understood statistical properties, but it turns out to be a challenging task due to the combinatorial nature of the sampling domain. Existing techniques in the literature focus solely on the set...
The existing random sampling methods have at least one of the following disadvantages: they 1) are a...
Abstract. Random sampling is a popular technique for providing fast approximate query answers, espec...
With the proliferation of the RDF data format, engines for RDF query processing are faced with very ...
Modern databases face formidable challenges when called to join (several) massive tables. Joins (esp...
Uniform sampling of join orders is known to be a competitive alternative to transformation-based opt...
Thesis (Ph.D.)--University of Washington, 2021. As the demand for data intensive pipelines has grown a...
Approximate query processing is an adequate technique to reduce response times and system load in ca...
Random sampling is a popular technique for providing fast approximate query answers, especially in d...
Abstract. There is a growing interest in on-line algorithms for analyzing an...
Abstract. Uniform sampling of join orders is known to be a competitive alternative to transformation...
Semi-stream join algorithms join a fast stream input with a disk-based master data relation. A commo...
The join operation combines information from multiple data sources. Efficient processing of join que...