Distributed stream processing frameworks have gained widespread adoption in the last decade because they abstract away the complexity of parallel processing. One of their key features is built-in fault tolerance. In this work, we examine the implementation, performance, and efficiency of this critical feature in four state-of-the-art frameworks: the established Spark Streaming and Flink, and the newer Spark Structured Streaming and Kafka Streams. We test their behavior under different types of faults and settings: master failures with and without high-availability setups, driver failures for the Spark frameworks, worker failures with and without exactly-once semantics, and application and task failures. We ...
As the amount of data being exchanged over the network increases, algorithms originally implemented ...
This paper considers the problem of supporting and efficiently implementing fault-tolerance ...
Major advances in the fault tolerance of distributed stream processing systems provided the systems ...
Fault tolerance is a key requirement in large-scale distributed stream processing engines (SPEs), es...
Stream processing emerged as a paradigm to continuously process incoming live data streams, such as ...
143 p. Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 2010. Stream processing emerged as ...
We present a replication-based approach to fault-tolerant distributed stream processing in the face ...
Stream-processing systems are designed to support an emerging class of applications that require sop...
Event Stream Processing (ESP) is a well-established approach for low-latency data processing enablin...