Topology-aware mapping has regained interest with the development of supercomputers whose topologies consist of thousands of processors and have large diameters. In such parallel architectures, careful mapping of tasks to processors, taking into account the properties of the underlying topology and the communication pattern of the mapped program, can improve the performance of the executed parallel programs. One of the most widely used metrics for capturing a parallel program’s communication overhead is the hop-bytes metric, which takes the processor topology into account, in contrast to the assumptions made under wormhole routing. In this work, we propose a KL-based iterative improvement heuristic for mappin...
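To make the hop-bytes metric mentioned above concrete, the following is a minimal sketch rather than the paper's implementation. It takes hop-bytes to be the sum, over all communicating task pairs, of the message volume multiplied by the number of hops between the processors hosting the two tasks. The 2D mesh without wraparound links, the task names, the byte counts, and the helper names `mesh_hops` and `hop_bytes` are all assumptions made for illustration.

```python
from typing import Dict, Tuple

Coord = Tuple[int, int]  # (row, column) of a processor on an assumed 2D mesh


def mesh_hops(a: Coord, b: Coord) -> int:
    """Hop count between two processors on a 2D mesh without wraparound
    (Manhattan distance). This topology is an illustrative assumption."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def hop_bytes(comm: Dict[Tuple[str, str], int], mapping: Dict[str, Coord]) -> int:
    """Sum of (bytes exchanged * hops travelled) over all communicating task pairs."""
    return sum(
        nbytes * mesh_hops(mapping[src], mapping[dst])
        for (src, dst), nbytes in comm.items()
    )


# Hypothetical communication volumes in bytes between four tasks.
comm = {("t0", "t1"): 100, ("t1", "t2"): 50, ("t0", "t3"): 10}

# Two candidate task-to-processor mappings on a 2x2 mesh.
naive = {"t0": (0, 0), "t1": (1, 1), "t2": (0, 1), "t3": (1, 0)}
better = {"t0": (0, 0), "t1": (0, 1), "t2": (1, 1), "t3": (1, 0)}

print(hop_bytes(comm, naive))   # heaviest pair t0-t1 sits two hops apart
print(hop_bytes(comm, better))  # heaviest pair t0-t1 is now adjacent
```

Placing the heaviest communicating pair on adjacent processors lowers the metric, which is the kind of gain a swap-based iterative improvement heuristic would search for.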
This paper is devoted to mapping iterative algorithms onto heterogeneous clust...
The optimal mapping of tasks of a parallel program onto nodes of a parallel computing system has a r...
This paper presents a parallel simulated annealing algorithm for solving the problem of mapping irre...
Abhinav Bhatele, Ph.D. student at the Parallel Programming Lab at the University of Illinois present...
166 p. Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 2010. Performance improvements thro...
Considering the large number of processors and the size of the interconnection...
Obtaining the best performance from a parallel program involves four important steps: 1. Choice of t...
Assignment of tasks of a parallel program onto processors of a distributed-memory system is critical...
Due to the advent of modern hardware architectures of high-performance comput-...
A fundamental issue affecting the performance of a parallel application running on message-passing p...
The task-to-processor mapping problem is addressed in the context of a local-memory multiprocessor w...
The orchestration of communication of distributed memory parallel applications on a parallel compute...
To execute a parallel program on a multicomputer system, the tasks of the program have to be mapped ...