Lack of knowledge of the underlying data distribution in distributed large-scale data can be an obstacle when issuing analytics and predictive modelling queries. Analysts often struggle to formulate analytics or exploration queries that satisfy their needs. In this paper, we study how exploration query results can be predicted in order to avoid executing ‘bad’ (non-informative) queries that waste network, storage, and financial resources, as well as time, in a distributed computing environment. The proposed methodology clusters a training set of exploration queries together with the cardinality of the results they retrieved (the score), and then uses query-centroid representatives to make predictions. After the training ph...
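The clustering-then-prediction idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each exploration query is encoded as a numeric vector (here a hypothetical `(x, y, radius)` range query), uses plain k-means with a deterministic farthest-first initialisation as a stand-in for whatever clustering the authors use, and predicts the score of a new query as the mean score of its nearest centroid. The function names, the toy data, and the `min_score` threshold are all assumptions made for illustration.

```python
import math
from statistics import mean


def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def fit_query_clusters(queries, scores, k, iters=50):
    """Cluster query vectors with k-means and attach a mean score to each centroid."""
    # Assumption: farthest-first initialisation, chosen only to keep the sketch deterministic.
    centroids = [list(queries[0])]
    while len(centroids) < k:
        far = max(queries, key=lambda q: min(math.dist(q, c) for c in centroids))
        centroids.append(list(far))

    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for q in queries:
            groups[nearest(q, centroids)].append(q)
        # New centroid = per-dimension mean of its members; an empty cluster keeps its old centroid.
        updated = [
            [mean(dim) for dim in zip(*g)] if g else c
            for g, c in zip(groups, centroids)
        ]
        if updated == centroids:
            break
        centroids = updated

    # Score of a centroid = average result cardinality of the training queries assigned to it.
    buckets = [[] for _ in range(k)]
    for q, s in zip(queries, scores):
        buckets[nearest(q, centroids)].append(s)
    centroid_scores = [mean(b) if b else 0.0 for b in buckets]
    return centroids, centroid_scores


def predict_score(query, centroids, centroid_scores):
    """Predicted cardinality of an unseen query: the score of its nearest query-centroid."""
    return centroid_scores[nearest(query, centroids)]


def should_execute(query, centroids, centroid_scores, min_score):
    """Hypothetical decision rule: only run queries predicted to be informative."""
    return predict_score(query, centroids, centroid_scores) >= min_score


# Toy training set: queries over a sparse region return few rows, over a dense region many.
train_queries = [
    (0.10, 0.10, 0.05), (0.12, 0.09, 0.05), (0.11, 0.13, 0.06),
    (0.90, 0.90, 0.05), (0.88, 0.92, 0.04), (0.91, 0.87, 0.05),
]
train_scores = [0, 2, 1, 480, 510, 495]

centroids, centroid_scores = fit_query_clusters(train_queries, train_scores, k=2)
sparse_prediction = predict_score((0.10, 0.12, 0.05), centroids, centroid_scores)
dense_prediction = predict_score((0.89, 0.90, 0.05), centroids, centroid_scores)
```

In this sketch a query is answered from the trained representatives alone, so a query with a low predicted score can be skipped without touching the distributed data store. How the threshold is chosen, and how prediction error is controlled, is not specified here.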
Large organizations have seamlessly incorporated data-driven decision making in their operations. Ho...
Industry 4.0 is taking industry by storm. Leveraging the data that is already presen...
Distance-based nearest neighbours (dNN) queries and aggregations over their answer sets are importan...
Fundamental to many predictive analytics tasks is the ability to estimate the cardinality (number of...
We introduce a predictive modeling solution that provides high quality predictive analytics over agg...
We study a novel solution to executing aggregation (and specifically COUNT) queries over large-scale...
Nowadays, the increasing number of user devices produces huge volumes of data that should be efficie...
The digitization of our lives causes a shift in data production as well as in the required data m...
In Business Intelligence systems, users interact with data warehouses by formu...
Inductive databases tightly integrate databases with data mining. Besides data, an inductive databas...