The rich dependency structure found in the columns of real-world relational databases can be exploited to great advantage, but it can also cause query optimizers (which usually assume that columns are statistically independent) to underestimate the selectivities of conjunctive predicates by orders of magnitude. We introduce CORDS, an efficient and scalable tool for automatic discovery of correlations and soft functional dependencies between columns. CORDS searches for column pairs that might have interesting and useful dependency relations by systematically enumerating candidate pairs and simultaneously pruning unpromising candidates using a flexible set of heuristics. A robust chi-squared analysis is applied to a sample of column values in ...
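The two ideas in the abstract above, a chi-squared test on sampled column values and the selectivity error caused by assuming independence, can be illustrated with a minimal sketch. This is not the CORDS implementation: the function names (`sample_pairs`, `chi_squared`, `cramers_v_squared`, `selectivities`) are hypothetical, rows are assumed to be plain dictionaries, and the normalization shown is the standard squared Cramér's V, chosen here only for illustration.

```python
import random
from collections import Counter


def sample_pairs(rows, col_a, col_b, k, seed=0):
    """Draw up to k rows at random and keep only the two columns of interest.

    `rows` is assumed to be a list of dicts; this layout is an assumption
    made for the sketch, not something the abstract specifies.
    """
    rng = random.Random(seed)
    picked = rng.sample(rows, min(k, len(rows)))
    return [(r[col_a], r[col_b]) for r in picked]


def chi_squared(pairs):
    """Pearson chi-squared statistic over the contingency table of the pairs.

    Returns (statistic, distinct values in column a, distinct values in column b).
    """
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    stat = 0.0
    for a, n_a in left.items():
        for b, n_b in right.items():
            expected = n_a * n_b / n          # count expected under independence
            observed = joint.get((a, b), 0)
            stat += (observed - expected) ** 2 / expected
    return stat, len(left), len(right)


def cramers_v_squared(pairs):
    """Chi-squared normalized to [0, 1]; 0 means independent, 1 fully dependent."""
    stat, d1, d2 = chi_squared(pairs)
    denom = len(pairs) * (min(d1, d2) - 1)
    return stat / denom if denom > 0 else 0.0


def selectivities(pairs, a, b):
    """Selectivity of (col_a = a AND col_b = b): independence estimate vs. actual."""
    n = len(pairs)
    p_a = sum(1 for x, _ in pairs if x == a) / n
    p_b = sum(1 for _, y in pairs if y == b) / n
    actual = sum(1 for x, y in pairs if x == a and y == b) / n
    return p_a * p_b, actual


if __name__ == "__main__":
    # Hypothetical table in which `model` fully determines `make`.
    rows = [{"make": "Honda", "model": "Accord"}] * 50 \
         + [{"make": "Toyota", "model": "Camry"}] * 50
    pairs = sample_pairs(rows, "make", "model", k=100)
    print(cramers_v_squared(pairs))
    print(selectivities(pairs, "Honda", "Accord"))
```

On a column pair where one column determines the other, the independence estimate multiplies the two marginal selectivities and so comes out lower than the true selectivity of the conjunction, which is the underestimation the abstract describes.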
Given a user-specified minimum correlation threshold and a market basket database with N items and T...
In this paper we study the problem of mining all frequent queries in a relational table, a...
Business-intelligence queries often involve SQL functions and algebraic expressions. There can be cl...
In relational query processing, there are generally two choices for access paths when performing a p...
Many relational databases exhibit complex dependencies between data attributes, caused either by the...
Given a set of data objects, correlation computing refers to the problem of efficiently finding grou...
One fundamental limitation of classical statistical modeling is the assumption that data is represen...
Given a database and a target attribute of interest, how can we tell whether there exists a function...
Data Mining (DM) represents the process of extracting interesting and previously unknown knowledge f...
Very little research in knowledge discovery has studied how to incorporate statistical methods to au...
We address the issue of mining frequent conjunctive queries in a relational da...
We study the problem of mining correlated patterns. Correlated patterns have advantages over associa...
In this paper, we propose a new efficient algorithm called Dep-Miner for disco...
Most unary relational database operators can be described through functions from tuples to tuples. ...
We describe an automatic database design tool that exploits correlations between attributes when rec...