Abstract—Collective communications are ubiquitous in parallel applications. We present two new algorithms for performing a reduction. The operation associated with our reduction needs to be associative and commutative. The two algorithms are developed under two different communication models (unidirectional and bidirectional). Both algorithms use a greedy scheduling scheme. For a unidirectional, fully connected network, we prove that our greedy algorithm is optimal when some realistic assumptions are respected. Previous algorithms fit the same assumptions and are only appropriate for some given configurations. Our algorithm is optimal for all configurations. We note that there are some configurations where our greedy algorithm significantly ...
We study the effect of limited communication throughput on parallel computation in a setting...
Karp and Zhang developed a general randomized parallel algorithm for solving branch and boun...
Reconfiguration is largely an unexplored property in the context of parallel models of computation. ...
Reduction is a core operation in parallel computing that combines distributed ...
In many distributed-memory parallel computers the only built-in communication primitive is point-to-...
We present a new, simple algorithmic idea for the collective communication operations broadcast, re...
Many parallel algorithms exhibit a hypercube communication topology. Such algorithms can easily be e...
In many distributed-memory parallel computers the only built-in communication primitive is point-to-...
In this paper, we present a method for overlapping communications on parallel computers for pipeli...
Pipelining is normally associated with shared memory and vector computers and rarely used as an algo...
Some common guidelines that can be used to design parallel algorithms under the single-c...
Greedy algorithms are practitioners' best friends—they are intuitive, simple to implement, and oft...
We study the greedy algorithm for delivering messages with deadlines in synchronous networks. The pr...
Abstract. We present a new, simple algorithmic idea for exploiting the potential for bidirectional c...
We study the greedy algorithm for delivering messages with deadline in synchronous networks. The pro...