Database sampling is widely used in many database applications when, for efficiency reasons, an entire database cannot be used. This paper analyses the use of consistent sampling, that is, sampling according to certain criteria (e.g. integrity constraints) used to evaluate the consistency of the resulting sample. This alternative to random sampling, the most common sampling strategy, is particularly appropriate in the context of constructing prototype databases to support Information System Development. The paper firstly presents a framework for evaluation of prototype database construction methods. Then a general description of the Consistent Database Sampling Process is introduced. Finally the paper outlines a sampling tool which imple...
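To make the idea of sampling under an integrity constraint concrete, the following is a minimal sketch and not the paper's Consistent Database Sampling Process or its tool. It assumes a single foreign-key constraint between two in-memory tables; the table names (`customers`, `orders`), the column names, and the helper functions `consistent_sample` and `violates_fk` are all hypothetical and chosen only for illustration.

```python
import random


def consistent_sample(parents, children, fk, pk, n, seed=0):
    """Randomly pick up to n child rows, then keep every parent row they reference.

    Illustrative only: handles one foreign key (children[fk] -> parents[pk]).
    """
    rng = random.Random(seed)
    sampled_children = rng.sample(children, min(n, len(children)))
    # Collect the parent keys the sampled children depend on.
    needed = {row[fk] for row in sampled_children}
    sampled_parents = [row for row in parents if row[pk] in needed]
    return sampled_parents, sampled_children


def violates_fk(parents, children, fk, pk):
    """Return the child rows whose foreign key has no matching parent row."""
    keys = {row[pk] for row in parents}
    return [row for row in children if row[fk] not in keys]


# Hypothetical data: 100 customers, 1000 orders, each order references a customer.
customers = [{"id": i, "name": f"c{i}"} for i in range(100)]
orders = [{"id": i, "customer_id": i % 100} for i in range(1000)]

sample_customers, sample_orders = consistent_sample(
    customers, orders, fk="customer_id", pk="id", n=20
)
dangling = violates_fk(sample_customers, sample_orders, "customer_id", "id")
```

In this sketch the child table is sampled at random and the parent table is then restricted to the referenced rows, so the sample satisfies the foreign-key constraint by construction. Sampling both tables independently at random would generally leave some orders pointing at customers that are missing from the sample.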
Testing an SQL database system by running large sets of deterministic or stochastic SQL statements ...
Recently, we have proposed an adaptive, random-sampling algorithm for general query size est...
We present an adaptive distributed query-sampling framework that is quality-conscious for extracting...
Database sampling is widely used in many database applications when, for efficiency reasons, an entire ...
Managing large amounts of information is one of the most expensive, time-consuming and non-trivial a...
In a wide range of application areas (e.g. data mining, approximate query evaluation, histo...
Populating the testing environment with relevant data represents a great challenge in software valid...
Database sampling has become a popular approach to handle large amounts of data in a w...
In the wake of growing database that has already become the trend of today’s business environment wi...
Perhaps the most flexible synopsis of a database is a uniform random sample of the data; such sample...
The database is one of the benchmarks that affect the quality of information systems. An effective i...
Testing an SQL database system by running large sets of deterministic or stochastic SQL ...
In a wide range of application areas (e.g. data mining, approximate query evaluation, ...
Many steps are involved in the process of turning an initial concept for a database into a finished ...