Users and administrators of large-scale infrastructures (e.g., datacenters and PlanetLab) are frequently in need of monitoring groups of machines in the infrastructure. Though there exist several distributed querying systems for this monitoring purpose, they are not group-based; they mostly focus on querying the entire system. In this paper, we present Moara, a new querying system that makes two novel contributions. First, Moara builds aggregation trees for different groups and adaptively maintains the trees to optimize the total message cost. Second, Moara supports a query language allowing groups to be specified implicitly via predicates consisting of arbitrarily nested unions and intersections. Our evaluations on Emulab, on PlanetLab, an...
The past few years have seen a major change in computing systems, as growing data volumes and stalli...
Continuous queries are used to monitor changes to time varying data and to provide results useful fo...
Distributed Data Stream Management Systems (DSMS) are increasingly used for the processing of high-r...
Users and administrators of large-scale infrastructures (e.g., datacenters and PlanetLab) are freque...
MapReduce model is a new parallel programming model initially developed for la...
Groupjoins, the combined execution of a join and a subsequent group by, are common in analytical que...
Data analysts need to understand the quality of data in the warehouse. This is often done by issuing...
Some aggregate and grouping queries are conceptually simple, but difficult to express in SQL. This d...
Typically a user desires to obtain the value of some aggregation function over distributed data item...
Sharing structured data in a PDMS is hard due to schema heterogeneity and peer autonomy. To overcome...
Continuous queries are persistent queries that allow users to receive new results when they become a...
The dream of computing power as readily available as the electricity in a wall socket is coming clos...
Mermaid is a testbed system which provides integrated access to multiple databases. Two ...
While traditional database systems optimize for performance on one-shot query processing, emerging l...
Advanced analytics and other Big Data applications call for query languages that can express the com...