Abstract Window functions have been part of the SQL standard since 2003 and have been studied extensively during the past decade. They are widely used in data analysis, and almost all current mainstream commercial databases support them. However, dataset sizes have grown steeply in recent years, and existing window function implementations are not efficient enough. Recently, sampling-based algorithms (e.g., online aggregation) have been proposed to deal with large and complex data in relational databases, offering a flexible trade-off between accuracy and efficiency. However, few sampling techniques have been considered for window functions in databases. In this paper, we extend our previous work (Song et al. in Asi...
Sliding windows are bounded sets which evolve together with an infinite data stream of records. Each...
Window aggregation is a core operation in data stream processing. Existing aggregation techniques fo...
Random sampling is an appealing approach to build synopses of large data streams because random samp...
Analytic functions represent the state-of-the-art way of performing complex data analysis within a ...
A large part of the data on the World Wide Web resides in the deep web. Executing structured, high-l...
Recently, we have proposed an adaptive, random-sampling algorithm for general query size est...
Existing SQL aggregate functions present important limitations to compute percentages. This article...
Sampling streams of continuous data with limited memory, or reservoir sampling...
Approximate query processing is an adequate technique to reduce response times and system load in ca...
In modern applications, it is a big challenge to analyze the order statistics about the most rec...
The concept of time-constrained SQL queries was introduced to address the problem of long-running SQ...
Perhaps the most flexible synopsis of a database is a uniform random sample of the data; such sample...
In this article, I show how to fit a generalized linear model to N observations on p variables store...
Query optimization is an important functionality of modern database systems and often based on estim...
Big data is now being utilized widely and developed rapidly. Research in the big data area is mean...