Distributed graph processing systems largely rely on proactive techniques for failure recovery. Unfortunately, these approaches (such as checkpointing) entail significant overhead. In this paper, we argue that distributed graph processing systems should instead use a reactive approach to failure recovery. The reactive approach sacrifices some completeness of the result (it may generate a slightly inaccurate result) in exchange for reducing the overhead during failure-free execution to zero. We build a system called Zorro that embodies this reactive approach, and integrate Zorro into two graph processing systems: PowerGraph and LFGraph. When a failure occurs, Zorro opportunistically exploits vertex replication (inherent in today’s graph processing systems) to ...
Distributed software systems have become the backbone of Internet services. Failures in production ...
In this work we have addressed the complex problem of recovery for concurrent failures in a distribu...
Large, production quality distributed systems still fail periodically, and do so sometimes catastro...
Distributed graph processing systems largely rely on proactive techniques for failure recovery. Unf...
Distributed graph processing frameworks have become increasingly popular for processing large graphs...
Distributed graph processing systems are an emerging area of big data systems. As graphs continue to...
Distributed graph processing systems increasingly require many compute nodes to cope with the requir...
Real-world graph processing applications often require combining the graph data with tabular data. M...
While various iterative graph algorithms can be expressed via asynchronous parallelism, lack of its ...
In contrast to conventional (trans)action concepts, the proposed dynamic action model includes the p...
Traditionally, fault-tolerant systems assume that failures are independent, often expressed as a thr...
The amount of data generated every day is growing exponentially in the big data era. A significant p...
Large-scale graph and machine learning analytics widely employ distributed iterative processing. Typ...