Optimization

Christopher Rodrigues
Thomas Jablin
Abdul Dakkak
Wen-mei Hwu

Publication date

October 2015

Abstract

Functional algorithmic skeletons promise a high-level programming interface for distributed-memory clusters that free developers from concerns of task decomposition, scheduling, and communication. Unfortunately, prior distributed functional skeleton frameworks do not deliver performance comparable to that achievable in a low-level distributed programming model such as C with MPI and OpenMP, even when used in concert with high-performance array libraries. There are several causes: they do not take advantage of shared memory on each cluster node; they impose a fixed partitioning strategy on input data; and they have limited ability to fuse loops involving skeletons that produce a variable number of outputs per input. We address these shortc...
