Scalable execution of continuous queries over massive data streams often requires splitting input streams into sub-streams over which query operators are executed in parallel. Automatic stream splitting is difficult in general, because the optimal parallelization may depend on application semantics. To enable application-specific stream splitting, we introduce splitstream functions, in which the user specifies stream partitioning and replication non-procedurally. For high-volume streams, the stream splitting itself becomes a performance bottleneck. We introduce a cost model that estimates the performance of splitstream functions with respect to throughput and CPU usage. We implement parallel splitstream functions, and relate experimenta...
The inherently large and varying volumes of information generated in large scale systems demand near...
Nowadays, more and more sources (connected devices, social networks, etc.) emi...
Data stream management systems (DSMSs) are scalable, highly available, and fault-tolerant systems th...
Scalable execution of continuous queries over massive data streams often requires splitting input st...
Numerous applications in for example science, engineering, and financial analysis increasingly requi...
Data streaming has become an important paradigm for the real-time processing of continuous ...
In this paper we study partitioning functions for stream proc...
This article addresses the profitability problem associated with auto-parallelization of general-pur...
In this paper, we study partitioning functions for stream processing systems that employ stateful da...
Distributed Data Stream Management Systems (DSMS) are increasingly used for the processing of high-r...
More and more use cases require fast, accurate, and reliable processing of large volumes of data. To...
Continuous queries over data streams typically produce large volumes of result streams. To scale up ...
Existing distributed stream systems adopt a tightly-coupled communication paradigm and focus on fine...
Stream reasoning is an emerging research area focused on providing continuous reasoning solutions fo...
Streaming applications transform possibly infinite streams of data and often have both high throughp...