Governments, universities, and companies expend vast resources building the top supercomputers. The processors and interconnect networks become faster, while the number of nodes grows exponentially. Problems of scale emerge, not least of which is collective performance. This thesis identifies and proposes solutions for two major scalability problems. Our first contribution is a novel algorithm for process-partitioning and remapping for exascale systems that has far better time and space scaling than known algorithms. Our evaluations predict an improvement of up to 60x for large exascale systems and arbitrary reduction in the large temporary buffer space required for generating new communicators. Our second contribution consist...
Future manycore Systems-on-Chip will integrate tens or even hundreds of cores. Tiled architectures h...
In irregular all-to-all communication, messages are exchanged between every pair of processors. The ...
We describe our efforts to scale message driven applications to a large number of processors on an ...
Supercomputers continue to expand both in size and complexity as we reach the beginning of the exasc...
Improving the performance of future computing systems will be based upon the ability of increasing t...
In this paper we study distributed algorithms on massive graphs where li...
Technology trends suggest that future machines will rely on parallelism to meet increasing performanc...
Technology trends suggest that future machines will rely on parallelism to meet increasing performan...
Network scalability has emerged as the essential problem in designing architectures and protocols fo...
The current trends in high performance computing show that large machines with tens of thousands of ...
The placement of tasks in a parallel application on specific nodes of a supercomputer can s...
Collective communication allows efficient communication and synchronization among a collection of pr...
We report on a project to develop a unified approach for building a library of collective communicat...
This work presents and evaluates algorithms for MPI collective communication operations on high perf...