In this document, I explore the problem of scheduling pipelined applications onto large-scale distributed platforms in order to optimize several criteria. Particular attention is given to maximizing the throughput (i.e., the number of data sets that can be processed per time unit), minimizing the latency (i.e., the time required to process one data set entirely), and minimizing the failure probability. First, I precisely define the models and the scheduling problems, and present surprising results, such as the difficulty of computing the optimal throughput and/or latency that can be obtained for a given mapping. In particular, I detail the importance of the communication models, which induce quite different levels of difficulty. Second, I give an...