Making State Explicit for Imperative Big Data Processing

Publication date

November 2015

Abstract

Data scientists often implement machine learning algorithms in imperative languages such as Java, Matlab and R. Yet such implementations fail to achieve the performance and scalability of specialised data-parallel processing frameworks. Our goal is to execute imperative Java programs in a data-parallel fashion with high throughput and low latency. This raises two challenges: how to support the arbitrary mutable state of Java programs without compromising scalability, and how to recover that state after failure with low overhead. Our idea is to infer the dataflow and the types of state accesses from a Java program and use this information to generate a stateful dataflow graph (SDG). By explicitly separating data from mutable state, SD...
