Managing large amounts of information is one of the most expensive, time-consuming, and non-trivial activities, and it usually requires expert knowledge. In a wide range of application areas, such as data mining, histogram construction, approximate query evaluation, and software validation, handling exponentially growing databases has become a difficult challenge, and a subset of the data is generally preferred. As a solution to the current challenges in managing large amounts of data, database sampling from the available operational data has proved to be a powerful technique. However, none of the existing sampling approaches considers the dependencies between the data in a relational database. In this paper, we propose a novel approach t...
Approximate query processing is an adequate technique to reduce response times and system load in ca...
Testing an SQL database system by running large sets of deterministic or stochastic SQL ...
One response to the proliferation of large datasets has been to develop ingenious ways to throw reso...
Populating the testing environment with relevant data represents a great challenge in software valid...
Database sampling has become a popular approach to handle large amounts of data in a w...
Database sampling is widely used in many database applications when, for efficiency reasons, an ent...
In a wide range of application areas (e.g. data mining, approximate query evaluation, histo...
The Information Era has witnessed a huge number of sources from websites. The abundance of...
Recently, we have proposed an adaptive, random-sampling algorithm for general query size est...
Generating synthetic data is useful in multiple application areas (e.g., datab...
Data mining is an emerging research area, whose goal is to extract significant patterns or interesti...
In the wake of growing databases, which have already become the trend in today’s business environment wi...
Perhaps the most flexible synopsis of a database is a uniform random sample of the data; such sample...