Many systems for big data analytics employ a data flow abstraction to define parallel data processing tasks. In this setting, custom operations expressed as user-defined functions are very common. We address the problem of performing data flow optimization at this level of abstraction, where the semantics of operators are not known. Traditionally, query optimization is applied to queries with known algebraic semantics. In this work, we find that a handful of properties, rather than a full algebraic specification, suffice to establish reordering conditions for data processing operators. We show that these properties can be accurately estimated for black-box operators by statically analyzing the general-purpose code of their user-defined functions...
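A minimal sketch of the kind of reordering test this abstract describes, assuming each operator's read and write field sets have already been conservatively estimated (e.g., by static analysis of its UDF). The names OperatorProps and may_reorder are illustrative, not the paper's API; the condition shown is one standard sufficient test for swapping two record-at-a-time operators:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OperatorProps:
    """Conservatively estimated properties of a black-box operator.
    In the setting of the abstract, the read/write sets would come
    from static analysis of the UDF code; here they are given directly."""
    name: str
    read_set: frozenset   # fields the UDF may read
    write_set: frozenset  # fields the UDF may create or modify

def may_reorder(first: OperatorProps, second: OperatorProps) -> bool:
    """Sufficient (conservative) condition for swapping two successive
    record-at-a-time operators: neither operator reads or writes a
    field that the other writes."""
    return (
        first.write_set.isdisjoint(second.read_set)
        and second.write_set.isdisjoint(first.read_set)
        and first.write_set.isdisjoint(second.write_set)
    )

# Hypothetical operators: a map deriving 'revenue' from 'price' and
# 'qty', followed by a filter on 'country'.
mp  = OperatorProps("compute_revenue",
                    frozenset({"price", "qty"}), frozenset({"revenue"}))
flt = OperatorProps("filter_country",
                    frozenset({"country"}), frozenset())

assert may_reorder(mp, flt)  # safe to push the filter below the map
```

The test is deliberately conservative: if the analysis over-approximates a read or write set, the check may reject a valid reordering, but it never admits an unsafe one.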
Enterprises are adapting large-scale data processing platforms, such as Hadoop, to gain act...
Currently, we witness an increased interest in large-scale analytical data flows on non-relational d...
Organizations adopt different databases for big data, which is huge in volume and has different data...
Advanced database applications demand new data modeling constructs beyond those available in...
Recent decades have seen an explosion in the diversity and scale of data analytics tasks. While data...
Full support of parallelism in object-relational database systems (ORDBMSs) is desired. The parallel...
Data transformations are fundamental operations in legacy data migration, data integration, data cle...
In the last decade, the World Wide Web has grown from being a platform where users passively viewed ...
Classic query optimization in relational database systems relies on phases (algebraic, physical, cos...
Large-scale data analysis relies on custom code both for preparing the data for analysis as well as ...
Big data analytical systems, such as MapReduce, perform aggressive materialization of intermediate j...
Over the past decade, a number of data intensive scalable systems have been developed to process ext...
In recent years, complex data mining and machine learning algorithms have become more common in data...
Since the introduction of cost-based query optimization, the performance-critical role of interestin...
Traditionally, query optimizers assume a direct mapping from the logical entities modeling the data ...