Nested Parallelism and Control Flow in Big Data Analytics Systems (Verschachtelte Parallelität und Kontrollfluss in Big-Data-Analysesystemen)

Gévay, Gábor Etele

Publication date

June 2022

DOI

Abstract

Over the last 15 years, numerous distributed dataflow systems have appeared for large-scale data analytics, such as Apache Flink and Apache Spark. Users of such systems write data analysis programs in a (more or less) high-level API, while the systems take care of the low-level details of executing the programs scalably on a cluster of machines. The systems' APIs consist of distributed collection types (or distributed matrix, graph, etc. types) and the corresponding parallel operations. Distributed dataflow systems work well for simple programs that can be expressed with just a few of the system-provided parallel operations. However, modern data analytics often demands the composition of larger programs, where 1) parallel op...
